METODO – Oct. 7th 2014, INSTITUT D’ÉLECTRONIQUE ET DE TÉLÉCOMMUNICATIONS DE RENNES. 1/25 PiSDF: Parameterized & Interfaced Synchronous Dataflow for MPSoCs.

Presentation transcript:

1/25 PiSDF: Parameterized & Interfaced Synchronous Dataflow for MPSoCs Runtime Reconfiguration. Karol DESNOS, Julien HEULOT, IETR, Rennes, France. METODO Workshop, Madrid, October 7th 2014.

2/25 Introduction. Motivations: Software complexity (lines of code per chip) x2 every 10 months. Hardware complexity (transistors per chip) x2 every 18 months. Software productivity (lines of code per day) x2 every 5 years. The difference between software complexity and software productivity is the Software Productivity Gap. [Figure: growth curves on a log scale.] Source: ITRS & Hardware-dependent Software, Ecker et al., Springer.

3/25 PiMM: Parameterized and Interfaced dataflow Meta-Model

4/25 PiMM. SDF features: parallelism, predictable, developer-friendly, limited expressivity. PiMM features: more expressive, hierarchical & compositional, statically parameterizable, dynamically reconfigurable. Dataflow Model + PiMM elements of semantics = Parameterized and Interfaced dataflow Meta-Model (PiMM). [Figure: example dataflow graphs; labels include actors A, B, C, src, snk, h and SetN; parameters P, N and Size; rates Size, 2*Size and Size/N; interfaces in, out, feed and back.] K. Desnos, M. Pelcat, J.-F. Nezan, S.S. Bhattacharyya, S. Aridhi. “PiMM: Parameterized and Interfaced Dataflow Meta-Model for MPSoCs Runtime Reconfiguration”, SAMOS13.

5/25 PiMM. [Figure: image-processing example graph; actors Read Config, Read Image, Filter and Send; inside Filter, interfaces in and out, configuration actor SetNbSlices and actor Kernel; parameters Size and N; rates 4, Size and Size/N.]

6/25 PiMM. [Figure: the same image-processing example graph (Read Config, Read Image, Filter, Send, SetNbSlices, Kernel; parameters Size and N; rates 4, Size and Size/N), annotated with resolved values =4, =2 and =2.]

7/25 PiMM: A Meta-Model. [Figure: hierarchical graph with in and out interfaces, parameter N, a Σ actor and a Join; rate sequences {N, N/2, N/4, …, 2, 0}, {N, 0, …, 0}, {0, N/2, N/4, …, 2, 0} and {0, …, 0, 1}.]

8/25 PiMM Predictability. The more dynamism, the less predictability. [Figure: spectrum from SDF to DPN; example graph with actors A, B and C and configuration actor Set_p setting parameter p (=3), shown at compile time, after configuration and after execution.] Source: S. Neuendorffer and E. Lee, “Hierarchical reconfiguration of dataflow models” in MEMOCODE.

9/25 SPIDER: Synchronous Parameterized and Interfaced Dataflow Embedded Runtime

10/25 SPIDER. Objective: improving multicore programming productivity. [Figure: existing approach compared with the SPIDER proposition.]

11/25 SPIDER Principle. Master tasks: run jobs, map & schedule, manage graphs, monitor & trace. Slave task: run jobs. [Figure: master and slaves connected through FIFOs carrying jobs, params, timings and data, plus a pool of data.]
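The master/slave split on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SPIDER's implementation: the names `slave` and `run_master`, the use of threads and `queue.Queue` as FIFOs, and the round-robin mapping are all hypothetical, and the master here only dispatches jobs whereas the slide says it also runs them.

```python
import queue
import threading
import time


def slave(job_fifo, timing_fifo):
    """Slave task: run jobs from its FIFO until a None sentinel arrives."""
    while True:
        job = job_fifo.get()
        if job is None:
            break
        name, func, params = job
        start = time.perf_counter()
        func(**params)
        # Report the measured execution time back to the master.
        timing_fifo.put((name, time.perf_counter() - start))


def run_master(jobs, n_slaves=2):
    """Master task: map jobs onto slaves, then collect the reported timings."""
    job_fifos = [queue.Queue() for _ in range(n_slaves)]
    timing_fifo = queue.Queue()
    workers = [threading.Thread(target=slave, args=(fifo, timing_fifo))
               for fifo in job_fifos]
    for worker in workers:
        worker.start()
    # Round-robin mapping is a placeholder, not SPIDER's scheduling policy.
    for index, job in enumerate(jobs):
        job_fifos[index % n_slaves].put(job)
    for fifo in job_fifos:
        fifo.put(None)  # tell each slave to stop
    for worker in workers:
        worker.join()
    trace = {}
    while not timing_fifo.empty():
        name, elapsed = timing_fifo.get()
        trace[name] = elapsed
    return trace


done = []
jobs = [(f"job{i}", lambda value: done.append(value), {"value": i})
        for i in range(4)]
trace = run_master(jobs)
```

Each job carries its parameters with it, and timings flow back on a separate FIFO, which mirrors the Jobs, Params and Timings channels named in the figure.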

12/25 SPIDER: a Step-by-Step Example. Legend: actor states (not schedulable, schedulable, executable, executed) and execution Gantt (Core 1, Core 2).
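The four actor states in the legend can be read as a small state machine. The sketch below assumes a strictly linear progression and guesses at the event behind each transition (parameters resolved, mapped by the master, job run); the slides only show the state names, so `ActorState`, `NEXT_STATE` and `advance` are hypothetical.

```python
from enum import Enum


class ActorState(Enum):
    NOT_SCHEDULABLE = "not schedulable"
    SCHEDULABLE = "schedulable"
    EXECUTABLE = "executable"
    EXECUTED = "executed"


# Assumed linear progression; the triggering events are an interpretation.
NEXT_STATE = {
    ActorState.NOT_SCHEDULABLE: ActorState.SCHEDULABLE,  # parameters resolved
    ActorState.SCHEDULABLE: ActorState.EXECUTABLE,       # mapped by the master
    ActorState.EXECUTABLE: ActorState.EXECUTED,          # job has run
}


def advance(state):
    """Return the state that follows `state`, or raise if it is final."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.value!r} is a final state")
    return NEXT_STATE[state]


state = ActorState.NOT_SCHEDULABLE
history = [state]
while state in NEXT_STATE:
    state = advance(state)
    history.append(state)
```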

13/25 SPIDER: a Step-by-Step Example. Current graph: Read Config (parameter Size), Read Image, Filter, Send; edge rates 4, Size, Size, 4. States shown: schedulable, not schedulable. Execution Gantt (Core 1, Core 2): Master SPIDER.

14/25 SPIDER: a Step-by-Step Example. Current graph: Read Config (parameter Size, =2), Read Image, Filter, Send; edge rates 4, Size, Size, 4. States shown: executable, not schedulable. Execution Gantt (Core 1, Core 2): Master SPIDER, Read Config.

15/25 SPIDER: a Step-by-Step Example. Current graph: Read Config, Read Image, Filter, Send with Size=2; edge rates 4, 2, 2, 4. States shown: executed, schedulable x1, schedulable x2, schedulable x1. Execution Gantt (Core 1, Core 2): Master SPIDER, Read Config, Master SPIDER, Read Image.
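Once Size=2 is known, the "x1, x2, x1" multiplicities on this slide are what the usual SDF balance equations give. The sketch below assumes the edges are Read Image → Filter (rates 4 and 2) and Filter → Send (rates 2 and 4), which is one reading of the garbled figure; `repetition_vector` is a hypothetical helper, not part of SPIDER.

```python
from fractions import Fraction
from math import lcm  # requires Python 3.9+


def repetition_vector(actors, edges):
    """Solve the balance equations of a connected SDF graph.

    edges: (producer, production_rate, consumer, consumption_rate)
    Returns the smallest positive integer firing count per actor.
    """
    ratio = {actors[0]: Fraction(1)}
    pending = list(edges)
    while pending:
        progressed = False
        for edge in list(pending):
            src, prod, dst, cons = edge
            if src in ratio and dst not in ratio:
                ratio[dst] = ratio[src] * prod / cons
            elif dst in ratio and src not in ratio:
                ratio[src] = ratio[dst] * cons / prod
            elif src in ratio and dst in ratio:
                if ratio[src] * prod != ratio[dst] * cons:
                    raise ValueError("inconsistent rates")
            else:
                continue  # neither end is known yet, retry later
            pending.remove(edge)
            progressed = True
        if not progressed:
            raise ValueError("graph is not connected")
    scale = lcm(*(r.denominator for r in ratio.values()))
    return {actor: int(r * scale) for actor, r in ratio.items()}


size = 2  # value produced by Read Config in the example
counts = repetition_vector(
    ["Read Image", "Filter", "Send"],
    [("Read Image", 4, "Filter", size), ("Filter", size, "Send", 4)],
)
```

With these assumed rates, `counts` gives one firing of Read Image, two of Filter and one of Send, matching the multiplicities shown on the slide.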

16/25 SPIDER: a Step-by-Step Example. Current graph: Read Config, Read Image, Filter 1, Filter 2, Send with Size=2; Read Image produces 4 tokens, each Filter consumes and produces 2, Send consumes 4. States shown: executable, schedulable, executed. Execution Gantt (Core 1, Core 2): Master SPIDER, Read Config, Master SPIDER, Read Image.

17/25 SPIDER: a Step-by-Step Example. Current graph: Read Config, Read Image, Send with Size=2; Filter 1 and Filter 2 are expanded into subgraphs, each with in and out interfaces (rate 2), a SetNbSlices actor setting its own parameter (N1, N2) and a Kernel actor with rates 2/N. States shown: executable, schedulable, executed, not schedulable. Execution Gantt (Core 1, Core 2): Master SPIDER, Read Config, Master SPIDER, Read Image, SetNbSlices 2, SetNbSlices 1.

18/25 SPIDER: a Step-by-Step Example. Current graph: unchanged from the previous step (Filter 1 and Filter 2 expanded, parameters N1 and N2 not yet resolved, Kernel rates 2/N). States shown: executed, schedulable, executable, not schedulable. Execution Gantt (Core 1, Core 2): Master SPIDER, Read Config, Master SPIDER, Read Image, SetNbSlices 2, SetNbSlices 1.

19/25 SPIDER: a Step-by-Step Example. Current graph: Read Config, Read Image, Send with Size=2; in Filter 1, N1=1 and Kernel rates are 2; in Filter 2, N2=2 and Kernel rates are 1. States shown: executed, schedulable, schedulable x1, schedulable x2. Execution Gantt (Core 1, Core 2): Master SPIDER, Read Config, Master SPIDER, Read Image, SetNbSlices 2, SetNbSlices 1, Master SPIDER, Kernel 1.
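The "x1" and "x2" Kernel multiplicities on this slide follow from the Size/N rate once each SetNbSlices actor has set its own N. A minimal sketch, assuming each Filter instance receives Size tokens on its in interface and its Kernel consumes Size/N tokens per firing; `kernel_firings` is a hypothetical helper name.

```python
def kernel_firings(size, n_slices):
    """Number of Kernel firings needed to consume `size` tokens
    when each firing consumes size / n_slices tokens."""
    if n_slices <= 0 or size % n_slices != 0:
        raise ValueError("n_slices must be a positive divisor of size")
    tokens_per_firing = size // n_slices
    return size // tokens_per_firing


size = 2                              # set by Read Config
slices = {"Filter 1": 1, "Filter 2": 2}  # N1 and N2 from the slide
firings = {name: kernel_firings(size, n) for name, n in slices.items()}
```

Under these assumptions Filter 1 needs one Kernel firing and Filter 2 needs two, which is consistent with the Gantt entries Kernel 1, Kernel 2-1 and Kernel 2-2 in the following steps.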

20/25 SPIDER: a Step-by-Step Example. Current graph: unchanged (Size=2, N1=1, N2=2). States shown: executed, executable, schedulable. Execution Gantt (Core 1, Core 2): Master SPIDER, Read Config, Master SPIDER, Read Image, SetNbSlices 2, SetNbSlices 1, Master SPIDER, Kernel 1, Kernel 2-2, Kernel 2-1, Send.

21/25 SPIDER: a Step-by-Step Example. Current graph: unchanged (Size=2, N1=1, N2=2). States shown: executed, scheduled, executable. Execution Gantt (Core 1, Core 2): Master SPIDER, Read Config, Master SPIDER, Read Image, SetNbSlices 2, SetNbSlices 1, Master SPIDER, Kernel 1, Kernel 2-2, Kernel 2-1, Send.

22/25 SPIDER: a Step-by-Step Example. Current graph: unchanged (Size=2, N1=1, N2=2). States shown: executed, executable. Execution Gantt (Core 1, Core 2): Master SPIDER, Read Config, Master SPIDER, Read Image, SetNbSlices 2, SetNbSlices 1, Master SPIDER, Kernel 1, Kernel 2-2, Kernel 2-1, Send.

23/25 SPIDER: a Step-by-Step Example. Current graph: unchanged (Size=2, N1=1, N2=2). States shown: executed. Execution Gantt (Core 1, Core 2): Master SPIDER, Read Config, Master SPIDER, Read Image, SetNbSlices 2, SetNbSlices 1, Master SPIDER, Kernel 1, Kernel 2-2, Kernel 2-1, Send.

24/25 About SPIDER. SPIDER implementation: Linux and bare metal. Architectures: Keystone 1 & 2 from Texas Instruments. Comparison with OpenMP: better performance for asymmetric parallel work. J. Heulot, M. Pelcat, K. Desnos, J.-F. Nezan, and S. Aridhi, “SPIDER: A Synchronous Parameterized and Interfaced Dataflow-Based RTOS for Multicore DSPs” [EDERC 14].

25/25 Future Work

26/25 Questions?