
1 SMASH: A Heuristic Methodology for Designing Partially Reconfigurable MPSoCs
Riccardo Cattaneo, Christian Pilato, Gianluca C. Durelli, Marco D. Santambrogio and Donatella Sciuto
Politecnico di Milano, Italy
IEEE International Symposium on Rapid System Prototyping – Montreal, Canada – October 4, 2013

2 What is an FPGA?
Christian Pilato, Politecnico di Milano
Hardware device that can be customized after fabrication to execute a specific functionality
– Distinct hardware blocks “intrinsically” run in parallel on the device
Heterogeneous grid of interconnected components: look-up tables (LUTs), block RAMs (BRAMs), digital signal processors (DSPs), switch matrices, input/output blocks (IOBs), etc.
Partial reconfiguration: possibility to reuse resources by reconfiguring part of the logic at run time

3 Heterogeneous SoCs with FPGAs
Highly coupled heterogeneous systems
– Zynq platform: dual-core ARM Cortex-A9 tightly coupled with Xilinx Artix-7 FPGA fabric
– High-speed, low-latency reconfigurable interconnect
[Figure: Avnet ZedBoard (Zynq-7000-based dev board)]
[Figure: Coarse-grain overview of the Zynq-7000 All Programmable SoC]

4 Design Challenges and Motivation
Hardware engineer needs to:
– partition the application into blocks (partitioning)
– determine which parts are better executed in hardware (mapping and scheduling)
– generate the system (architecture refinement)
The steps are strictly interdependent!
Partial reconfiguration allows reusing the same logic across different tasks
– More tasks can be ported to hardware
– Significant overhead must be taken into account
[Figure labels: INPUT, SMASH]

5 SMASH: Proposed Methodology
Design Space Exploration
– determines the proper mapping and scheduling
Architecture Refinement
– customizes the architectural template to derive the corresponding platform

6 Mapping and Scheduling
Input:
– Task graph (DAG)
– Architectural template, which identifies the resource constraints
– Implementations: list of different trade-offs in terms of performance and resources
Output:
– Implementation and component for each task
– Order of execution

7 Implementation vs. Component
Each task can have multiple alternative implementations on the same component
– Faster implementations usually require more resources
Some tasks can share an implementation to execute the same functionality multiple times
– Hardware reuse: no reconfiguration is required
Implementation is more related to functionality and resources
Component is more related to where the task is actually executed
– Processor or hardware module
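The performance/resource trade-off between alternative implementations can be illustrated with a minimal Python sketch. The `Implementation` record, the `fastest_that_fits` helper, and all numbers are hypothetical and are not taken from the slides; the greedy pick is only an illustration, not the selection rule SMASH uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Implementation:
    """One alternative way of realizing a task (hypothetical record)."""
    name: str
    time: int  # estimated execution time, arbitrary units
    area: int  # resources required on the device, arbitrary units


def fastest_that_fits(alternatives, free_area):
    """Return the fastest alternative whose area fits, or None if none fits."""
    feasible = [impl for impl in alternatives if impl.area <= free_area]
    return min(feasible, key=lambda impl: impl.time) if feasible else None


# Two made-up alternatives for the same task: the faster one needs more area.
fir_alternatives = [
    Implementation("fir_small", time=40, area=100),
    Implementation("fir_fast", time=10, area=400),
]
```

With 500 free area units the faster variant is chosen; with 200 only the smaller one fits.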

8 SMASH: Execution Overview
SMASH: Simultaneous MApping and Scheduling Heuristic
[Flowchart: each SMASH iteration generates a trace, schedules the trace, evaluates the metrics, and stores the solution; at "Termination?", No starts another iteration and Yes returns the best solution]
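The iteration loop in the flowchart can be sketched as a small driver function. This is only a skeleton under stated assumptions: the three callables, the fixed iteration budget used as the termination test, and the convention that lower cost is better are all hypothetical.

```python
import random


def smash_loop(generate_trace, schedule_trace, evaluate, iterations=100, seed=0):
    """Generate, schedule, and evaluate traces; keep the best one seen."""
    rng = random.Random(seed)
    best_trace, best_cost = None, float("inf")
    for _ in range(iterations):           # termination: iteration budget (assumption)
        trace = generate_trace(rng)       # generate trace
        schedule = schedule_trace(trace)  # schedule trace
        cost = evaluate(schedule)         # evaluate metrics (lower is better)
        if cost < best_cost:              # store solution
            best_trace, best_cost = trace, cost
    return best_trace, best_cost          # return best solution
```

Passing the three steps in as callables keeps the loop independent of how traces are built or evaluated, which matches the slide's separation of the boxes.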

9 Exploring Mapping and Scheduling
Exploration based on the Serial Generation Scheme (SGS)
– Constructive approach to better handle design constraints
– A decision is not taken if it would lead to a constraint violation
Different combinations of mapping and scheduling
– Each decision represents a mapping of a task with respect to an implementation and a processing element
– The order of selection represents the priority values for resolving scheduling conflicts on the resources
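A constructive, constraint-aware generation step in the spirit of the SGS can be sketched as follows. The area model (a component is charged once, when first instantiated), the uniform random choices, and all names are assumptions made for illustration; the slides do not specify them.

```python
import random


def generate_trace(preds, options, component_area, area_budget, rng):
    """Build a trace one decision at a time, skipping inadmissible decisions.

    preds: task -> set of predecessor tasks (a DAG)
    options: task -> list of (component, implementation) alternatives
    component_area: component -> area charged when it is first instantiated
    """
    trace, done, instantiated, used_area = [], set(), set(), 0
    while len(done) < len(preds):
        # Only tasks whose predecessors are all decided are eligible.
        ready = sorted(t for t in preds if t not in done and preds[t] <= done)
        task = rng.choice(ready)
        # A decision is admissible only if it keeps the area within budget.
        admissible = [
            (comp, impl) for comp, impl in options[task]
            if comp in instantiated
            or used_area + component_area[comp] <= area_budget
        ]
        if not admissible:
            raise ValueError(f"no admissible decision for task {task}")
        comp, impl = rng.choice(admissible)
        if comp not in instantiated:
            instantiated.add(comp)
            used_area += component_area[comp]
        trace.append((task, comp, impl))
        done.add(task)
    return trace


# Hypothetical four-task graph with one processor and two hardware modules.
preds = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
options = {t: [("cpu", "sw_" + t), ("hw0", "hw_" + t), ("hw1", "hw_" + t)]
           for t in preds}
component_area = {"cpu": 0, "hw0": 100, "hw1": 100}
trace = generate_trace(preds, options, component_area,
                       area_budget=100, rng=random.Random(0))
```

Because inadmissible decisions are filtered out before the choice is made, every trace produced this way respects both the precedence and the area constraint by construction.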

10 Ant Colony Optimization
Our proposed approach is based on Ant Colony Optimization (ACO) to limit infeasible solutions
– Cooperative behavior of the ants while searching for a trace
– The ant has different possibilities at each step and takes stochastic decisions, composing a trace
Stochastic principles guarantee exploration (a probability is generated for each admissible decision at each step)
Feedback guarantees the exploitation of good parts of the solutions

11 Algorithm Overview
Pseudo-code of the proposed ACO-based exploration:
[Figure: pseudo-code listing, not preserved in the transcript, annotated with "Exploration: generating trace", "Mapping decision", and "Exploitation: updating global information"]

12 Stochastic Selection Process
At each decision point d, the probability to assign a candidate j (task/communication) to a proper implementation point i (implementation + processing element) is:
[Equation not preserved in the transcript; its terms are labeled "global heuristic" and "local heuristic"]
Global information G: feedback information
– Probability that the decision leads to a good solution
Local heuristic L: problem-specific hint
– “Adjusted” by the global heuristic if wrong
Roulette wheel and extraction of a combination i, j
– A probability is generated iff the resources required by the resulting PEs can be satisfied by the architecture
There is always the possibility of adding a new PE or reusing an existing one (platform customization)
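Since the slide's formula is not available, the sketch below uses the classic ACO selection rule as a stand-in: the probability of a combination is taken to be proportional to G^alpha * L^beta, normalized over the admissible combinations. The exponents `alpha` and `beta`, the function names, and the dictionary layout are assumptions, not the authors' definition.

```python
import random


def selection_probabilities(candidates, G, L, alpha=1.0, beta=1.0):
    """Probability of each admissible (i, j) combination, classic ACO form."""
    weights = {c: (G[c] ** alpha) * (L[c] ** beta) for c in candidates}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def roulette_wheel(probabilities, rng):
    """Draw one combination with the given (non-empty) probabilities."""
    if not probabilities:
        raise ValueError("no admissible combination to choose from")
    r = rng.random()
    cumulative = 0.0
    for choice, p in probabilities.items():
        cumulative += p
        if r < cumulative:
            return choice
    return choice  # guard against floating-point round-off
```

Only admissible combinations should be passed in as `candidates`, mirroring the rule that a probability is generated only when the architecture can satisfy the required resources.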

13 More about SMASH
SMASH: Simultaneous MApping and Scheduling Heuristic
[Flowchart repeated from slide 8: generate trace, schedule trace, evaluate metrics, store solution, "Termination?", return best solution]

14 Trace Generation and Evaluation
Evaluation is performed only on the complete trace
– Updated version of the original task graph (TG), augmented with communications and reconfigurations
Reconfiguration is taken into account from the early stages of the design process
Possibility to include different evaluation methods
– Analytical estimations vs. TLM simulations
Decisions composing the best solution are reinforced
– As time goes on, the best trace is identified
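Reinforcing the decisions of the best solution is commonly done in ACO with an evaporation step followed by a reward. The sketch below assumes that scheme; the `evaporation` and `reward` parameters and the dictionary of global information are hypothetical, and the slides do not state the actual update rule.

```python
def update_global_info(G, best_trace, evaporation=0.1, reward=1.0):
    """Return new global information: all values decay, best decisions gain."""
    # Evaporation: every decision slowly loses attractiveness.
    updated = {d: (1.0 - evaporation) * v for d, v in G.items()}
    # Reinforcement: decisions that compose the best trace are rewarded.
    for decision in best_trace:
        updated[decision] = updated.get(decision, 0.0) + reward
    return updated
```

Repeating this update over many iterations concentrates probability on decisions that keep appearing in good traces.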

15 Scheduling Definition
Input
– Task graph (DAG)
– Trace: ordered list of mapping decisions (task-component-implementation)
Output
– Start/end time estimations for each task
Goal
– Reduce total execution time

Task  Component  Implementation
A     p1         impl_0
B     p2         impl_1
C     p1         impl_2
D     p3         impl_3
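How a trace turns into start/end times can be sketched with a simple list scheduler that processes decisions in trace order. The dependency edges and execution times below are made up for illustration (the slide does not give them), and communication and reconfiguration delays are ignored in this sketch.

```python
def schedule(preds, trace, exec_time):
    """Assign (start, end) to each task, following the trace order.

    A task starts once its predecessors have finished and its component
    is free; trace order breaks ties on shared components.
    """
    end, free_at, times = {}, {}, {}
    for task, component, impl in trace:
        ready = max((end[p] for p in preds[task]), default=0)
        start = max(ready, free_at.get(component, 0))
        finish = start + exec_time[impl]
        times[task] = (start, finish)
        end[task] = finish
        free_at[component] = finish
    return times


# Trace from the table above; edges and times are hypothetical.
preds = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
trace = [("A", "p1", "impl_0"), ("B", "p2", "impl_1"),
         ("C", "p1", "impl_2"), ("D", "p3", "impl_3")]
exec_time = {"impl_0": 2, "impl_1": 3, "impl_2": 4, "impl_3": 1}
times = schedule(preds, trace, exec_time)
makespan = max(finish for _, finish in times.values())
```

The sketch relies on the trace listing every task after its predecessors, which a constructive generation scheme would guarantee.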

16 Scheduling: Methodology Overview
[Flow diagram: the SMASH scheduler takes the task graph and trace, creates the extended task graph, performs the actual scheduling (assigning times), and evaluates the metrics]

17 Extended TG: Communications
Explicit communication tasks are added based on the communication topology

18 Extended TG: Reconfigurations
A reconfiguration task is introduced iff:
– Two processing tasks are mapped on the same component, and
– Their implementations are different, i.e., the module cannot be reused
Insertion of a reconfiguration task:
– New edges are introduced from all WRITEs exiting the source processing task to the reconfiguration
– New edges are introduced from the reconfiguration to all READs entering the target processing task

19 Extended TG: Reconfigurations

Task  Component  Implementation
A     p1         impl_0
B     p2         impl_1
C     p1         impl_2
D     p3         impl_3
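The condition for introducing a reconfiguration task can be checked directly on a trace. The sketch below is a simplified illustration: it compares consecutive tasks on each component and reports where the implementation changes, without modeling the WRITE/READ edges; the function name and tuple layout are assumptions.

```python
def find_reconfigurations(trace):
    """Return (source_task, target_task, component) for each needed reconfiguration."""
    last_on = {}  # component -> (task, implementation) most recently mapped there
    needed = []
    for task, component, impl in trace:
        if component in last_on and last_on[component][1] != impl:
            needed.append((last_on[component][0], task, component))
        last_on[component] = (task, impl)
    return needed


# Trace from the table above.
trace = [("A", "p1", "impl_0"), ("B", "p2", "impl_1"),
         ("C", "p1", "impl_2"), ("D", "p3", "impl_3")]
reconfigurations = find_reconfigurations(trace)
```

For this trace, tasks A and C share component p1 with different implementations, so one reconfiguration is needed between them; B and D are alone on their components and need none.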

20 Trace Evaluation
Possibility to integrate different policies to generate the corresponding schedule

21 Architecture Refinement
The actual platform instance is derived based on the resulting decisions
– Hardware modules with only one task assigned are converted into static IP blocks
– Hardware modules with more than one task assigned are represented as reconfigurable regions
Integration with the generation of the run-time manager to manage reconfigurations
– Still work in progress and manually performed
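The refinement rule for hardware modules can be sketched as a classification over the final trace. The function name, the string labels, and the explicit set of hardware components are hypothetical; the rule itself (one task gives a static IP, more than one gives a reconfigurable region) follows the slide.

```python
from collections import defaultdict


def refine_architecture(trace, hardware_components):
    """Classify each used hardware module by the number of tasks mapped to it."""
    tasks_on = defaultdict(list)
    for task, component, _impl in trace:
        if component in hardware_components:  # processors are left untouched
            tasks_on[component].append(task)
    return {
        module: "static IP" if len(tasks) == 1 else "reconfigurable region"
        for module, tasks in tasks_on.items()
    }


# Hypothetical example: p1 hosts two tasks, p2 and p3 host one each.
trace = [("A", "p1", "impl_0"), ("B", "p2", "impl_1"),
         ("C", "p1", "impl_2"), ("D", "p3", "impl_3")]
platform = refine_architecture(trace, hardware_components={"p1", "p2", "p3"})
```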

22 Experimental Evaluation
Synthetic benchmarks (TGFF)
– Focus on scalability of the approach
– Possibility to evaluate different task graph patterns
Resulting systems (platform instance and extended task graph with mapping/scheduling decisions) converted into virtual platforms
– Validation of the different solutions assuming correctness of the execution
Simulations performed with Synopsys Platform Architect
– VPU performance annotations extracted from the tasks’ implementations

23 Experimental Setup
Three different classes of experiments:
– Static: FPGA area is divided into a set of up to K_S static IP cores (no partial reconfiguration)
– Mixed: both IP cores and reconfigurable regions can be used, with an upper bound of K_M IPs and R_M reconfigurable regions
– Reconfigurable: architectures with no more than K_R regions
Reconfigurable regions can also be deployed as static cores in the final architecture if only one task is assigned to them

24 Experimental Results
[Table: one row per number of tasks (#Task); for each of the static, mixed, and reconfigurable classes, columns IPs, RRs, HW tasks, and #Reconf; values not preserved in the transcript]
Small task graphs cannot benefit from reconfiguration
Large task graphs are affected by communication overhead

25 Conclusions and Future Work
SMASH is an automated methodology to design reconfigurable systems
– It determines the mapping and scheduling of the different tasks
– It allows customizing the architectural template
Future work
– Integration of floorplanning procedures to compute and validate physical constraints of the blocks
– Automatic generation of the platform specification

26 End…

