HIGH LEVEL SYNTHESIS WITH AREA CONSTRAINTS FOR FPGA DESIGNS: AN EVOLUTIONARY APPROACH Tesi di Laurea di: Christian Pilato Matr.n. 674373 Relatore: Prof.

Presentation transcript:

HIGH LEVEL SYNTHESIS WITH AREA CONSTRAINTS FOR FPGA DESIGNS: AN EVOLUTIONARY APPROACH Tesi di Laurea di: Christian Pilato Matr.n Relatore: Prof. Fabrizio FERRANDI Correlatore: Ing. Antonino TUMEO Politecnico di Milano

Summary
Outline:
- High-Level Synthesis
- Proposed methodology
- Experimental results
- Some further extensions
- Conclusion and future work

High-Level Synthesis: problem description
"High-Level Synthesis means going from an algorithmic level specification of a behaviour of a digital system to a register level structure that implements that behavior". McFarland, et al., Proc. IEEE, February 1990.
Inputs:
- behavioral description (in C language)
- library of different types of resources
- set of constraints
Output: register-transfer level (RTL) design in a hardware description language (e.g. SystemC, VHDL or Verilog)
Goal: minimize objectives (area, latency, etc.)
Three main sub-tasks:
1. operation scheduling: when operations start their execution
2. resource allocation and binding: where operations are executed, where values are stored and how elements are interconnected
3. controller synthesis: which operations are issued
[Figure: the High-Level Synthesis tool. Inputs: behavioral specification, design constraints, resource library, objectives. Stages: scheduling, allocation, binding, controller synthesis. Output: datapath and controller.]

High-Level Synthesis: problem description
What are the problems?
- All the sub-tasks are NP-complete: no efficient exact algorithms are known
- Interconnections have to be considered: they account for up to 80% of the final area
- All the tasks are closely interdependent
- Most of the information is available only at the end of the synthesis
Proposed answer: genetic algorithms
- Try non-deterministic approaches with feedback information
- Multi-objective optimization: reducing the problem to a single objective (weighted average) is not efficient
- Non-dominated Sorting Genetic Algorithm (NSGA-II)
K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan, “A Fast and Elitist Multi-Objective Genetic Algorithm: NSGA-II,” Proceedings of the Parallel Problem Solving from Nature VI Conference, pp. 849–858, 2000.
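To illustrate why a weighted average can miss trade-offs that a Pareto-based method such as NSGA-II keeps, here is a minimal Python sketch. The design points are invented for illustration and the function names are hypothetical; this is not the thesis implementation.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (area, latency), both to be minimized

def dominates(a: Point, b: Point) -> bool:
    """a dominates b if it is no worse in every objective and better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def weighted_best(points: List[Point], w_area: float, w_latency: float) -> Point:
    """Single-objective reduction: pick the minimum weighted sum."""
    return min(points, key=lambda p: w_area * p[0] + w_latency * p[1])

# Made-up design points (area, latency); not measured data.
designs = [(100, 20), (150, 12), (200, 10), (160, 15), (120, 19)]
front = pareto_front(designs)
```

In this made-up data set, (120, 19) belongs to the Pareto front, yet no choice of positive weights makes it the weighted-sum minimum, because it lies above the line joining (100, 20) and (150, 12).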

High-Level Synthesis and Design Space Exploration
The proposed methodology

Experimental results
Development framework
- Integrated in the PandA framework, an open-source C++ framework covering different aspects of the hardware-software design of embedded systems
- Evolutionary computation with the Open BEAGLE framework
Functional validation
- Comparison between Verilog and C simulations
Estimation model validation
- Comparison between estimations and logic synthesis values
- Average error: 4.02%
- Standard deviation: 2.82%
- Maximum error: less than 10%
- These values can be effectively used as fitness values

Experimental results
Design Space Exploration validation
- Population size of individuals, evolving up to a maximum of 200 generations: the best trade-off between overall execution time and solution quality
Considerations:
- It takes into account all elements in the design solution
- It can cover a good number of trade-offs between the fastest solution and the minimal-area solution
- Better approach than existing tools to deal with area constraints
Paper accepted for publication at the International Symposium on Systems, Architectures, MOdeling and Simulation (SAMOS), Samos, Greece, July 2007. Title: “An Evolutionary Approach to Area-Time Optimization of FPGA designs”

Some extensions
Some features just provided:
Weighted clique covering: in register allocation, to reduce interconnections
- A higher weight is assigned to a compatibility edge when the two values involve the same functional units
- Clique covering on a weighted graph; results show a further reduction of overall area of up to 10%
Fitness inheritance: to reduce overall execution time
- A fraction of the expensive real evaluations is replaced by an estimation based on similar, already evaluated individuals
- It reduces overall execution time by over 25%
- No substantial difference in the final Pareto-optimal solutions
Paper submitted to the IEEE Congress on Evolutionary Computation (CEC) 2007, Singapore, September. Title: “Fitness Inheritance In Evolutionary and Multi-Objective High Level Synthesis”
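Fitness inheritance can be sketched in Python as follows. The slide does not say how similarity is measured or how the estimate is formed, so the Hamming distance, the inverse-distance weighting over the `k` nearest evaluated individuals, and all names below are assumptions made for illustration.

```python
import random
from typing import Callable, Dict, Tuple

Chromosome = Tuple[int, ...]
Fitness = Tuple[float, ...]  # e.g. (area, latency)

def hamming(a: Chromosome, b: Chromosome) -> int:
    """Number of genes in which two chromosomes differ."""
    return sum(x != y for x, y in zip(a, b))

def inherited_fitness(chrom: Chromosome,
                      archive: Dict[Chromosome, Fitness],
                      k: int = 2) -> Fitness:
    """Estimate fitness from the k most similar evaluated individuals
    (assumed scheme: inverse-distance weighted average)."""
    nearest = sorted(archive, key=lambda c: hamming(chrom, c))[:k]
    weights = [1.0 / (1 + hamming(chrom, c)) for c in nearest]
    total = sum(weights)
    n_obj = len(archive[nearest[0]])
    return tuple(sum(w * archive[c][i] for w, c in zip(weights, nearest)) / total
                 for i in range(n_obj))

def evaluate(chrom: Chromosome,
             archive: Dict[Chromosome, Fitness],
             real_eval: Callable[[Chromosome], Fitness],
             p_inherit: float,
             rng: random.Random) -> Fitness:
    """With probability p_inherit skip the expensive evaluation."""
    if archive and rng.random() < p_inherit:
        return inherited_fitness(chrom, archive)
    fitness = real_eval(chrom)   # expensive synthesis and estimation
    archive[chrom] = fitness     # only real evaluations enter the archive
    return fitness

# Made-up archive of already evaluated individuals.
archive = {(0, 0, 0): (100.0, 10.0), (1, 1, 1): (200.0, 6.0)}
estimate = inherited_fitness((0, 0, 1), archive)
```

Only real evaluations are stored in the archive here, so that estimates are never built on other estimates; that is a design choice of this sketch, not something stated in the slides.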

Conclusion and future work
The main contributions of this thesis are:
- A high-level synthesis flow from C specifications to HDL descriptions
- Integration of a model for fast estimation of synthesis results
- Design space exploration with a genetic algorithm:
  - It takes into account all elements composing the design solution
  - Close fit with real values
  - Multi-objective concurrent optimization
Future work:
- Optimize the results coming from the synthesis flow
- Further reduce the overall execution time of the proposed methodology
- Refine the estimation model and specialize it for different targets

Christian PILATO Matr. n Thank you!

The proposed High-Level Synthesis flow
The proposed flow is organized as follows:
- From C to an intermediate representation: a graph representation is produced from GIMPLE
- High-Level Synthesis flow:
  1. Partial binding and scheduling
  2. Finite State Machine creation
  3. Register allocation
  4. Interconnection allocation
  5. Performance and area estimations
- From data structures to an intermediate representation in the form of a graph
- From the intermediate representation to a Hardware Description Language (e.g. Verilog) ready for logic synthesis

1. Partial binding and scheduling
Partial binding: force an operation to be executed on a selected functional unit instance
- Example: β(+1) = A
- A technique introduced to partially control the final area occupation
- It can affect scheduling, register allocation and interconnection allocation
Scheduling: assign a starting control step to each operation to be executed
- Many scheduling algorithms are able to support partial binding (Integer Linear Programming formulation, list-based algorithm, etc.)
- Different solutions are obtained depending on the selected algorithm
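The interaction between partial binding and scheduling can be sketched with a simple list scheduler in Python. This is an illustrative sketch, not the scheduler used in the thesis: it assumes single-cycle operations, uses dictionary order as the priority, and all operation and unit names are hypothetical.

```python
from typing import Dict, List, Tuple

def list_schedule(ops: Dict[str, str],
                  deps: Dict[str, List[str]],
                  units: Dict[str, int],
                  binding: Dict[str, int]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """ops: operation -> unit type; deps: operation -> predecessors;
    units: unit type -> number of instances;
    binding: partial binding, operation -> forced instance index.
    Returns (start control step, chosen instance) per operation."""
    start: Dict[str, int] = {}
    chosen: Dict[str, int] = {}
    step = 0
    while len(start) < len(ops):
        busy = set()  # (unit type, instance) pairs used in this step
        for op, kind in ops.items():  # insertion order acts as priority
            if op in start:
                continue
            # Ready only if every predecessor finished in an earlier step.
            if any(p not in start or start[p] >= step for p in deps.get(op, [])):
                continue
            # A bound operation may only use its forced instance.
            candidates = [binding[op]] if op in binding else range(units[kind])
            for inst in candidates:
                if (kind, inst) not in busy:
                    busy.add((kind, inst))
                    start[op], chosen[op] = step, inst
                    break
        step += 1
    return start, chosen

ops = {"a": "add", "b": "add", "c": "mul", "d": "add"}
deps = {"c": ["a", "b"], "d": ["c"]}
units = {"add": 2, "mul": 1}

free_start, _ = list_schedule(ops, deps, units, binding={})
bound_start, _ = list_schedule(ops, deps, units, binding={"a": 0, "b": 0})
```

Forcing `a` and `b` onto the same adder serializes them: the second adder is no longer needed, which saves area, but the schedule grows from three control steps to four.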

2-3. Finite State Machine creation and Register allocation
Scheduling gives information about concurrent operations. This information is useful for controller synthesis and register allocation.
A State Transition Graph (STG), based on the Moore FSM model, is created from the scheduled specification:
- It represents control flow and concurrent operations
- Conditional operations create branches based on the evaluated condition values
Register allocation: allocate elements to store values across cycle step boundaries. A compiler approach has been implemented on the STG:
- Liveness analysis based on dataflow equations
- Interference graph based on liveness information
- Different heuristics to minimize the number of registers
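The compiler-style register allocation described above can be sketched in Python as liveness analysis, interference graph construction and a greedy coloring. The four-state STG, the value names and the coloring heuristic (highest degree first) are assumptions chosen for illustration; the thesis mentions several heuristics without detailing them here.

```python
from typing import Dict, List, Set

def liveness(states: List[str], succ: Dict[str, List[str]],
             use: Dict[str, Set[str]], define: Dict[str, Set[str]]):
    """Iterate the dataflow equations to a fixed point:
    live_out[s] = union of live_in over successors of s
    live_in[s]  = use[s] | (live_out[s] - define[s])"""
    live_in = {s: set() for s in states}
    live_out = {s: set() for s in states}
    changed = True
    while changed:
        changed = False
        for s in reversed(states):
            out = set().union(*(live_in[t] for t in succ.get(s, [])))
            inn = use[s] | (out - define[s])
            if out != live_out[s] or inn != live_in[s]:
                live_out[s], live_in[s] = out, inn
                changed = True
    return live_in, live_out

def interference(live_out: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Values alive across the same state boundary cannot share a register."""
    graph: Dict[str, Set[str]] = {}
    for values in live_out.values():
        for v in values:
            graph.setdefault(v, set()).update(values - {v})
    return graph

def color(graph: Dict[str, Set[str]]) -> Dict[str, int]:
    """Greedy coloring, highest degree first; colors are register indices."""
    reg: Dict[str, int] = {}
    for v in sorted(graph, key=lambda v: (-len(graph[v]), v)):
        taken = {reg[n] for n in graph[v] if n in reg}
        reg[v] = next(r for r in range(len(graph) + 1) if r not in taken)
    return reg

# Hypothetical linear STG: S0 -> S1 -> S2 -> S3
states = ["S0", "S1", "S2", "S3"]
succ = {"S0": ["S1"], "S1": ["S2"], "S2": ["S3"]}
define = {"S0": {"a", "b"}, "S1": {"c"}, "S2": {"d"}, "S3": set()}
use = {"S0": set(), "S1": {"a", "b"}, "S2": {"a", "c"}, "S3": {"d"}}

_, live_out = liveness(states, succ, use, define)
registers = color(interference(live_out))
```

In this example `b` and `c` are never alive across the same boundary, so they end up sharing a register and two registers suffice for four values.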

4-5. Interconnection allocation and result estimations
The final steps
Interconnection allocation: allocate elements to interconnect the hardware components
- Mux-based architecture: port swapping for commutative operations
- Glue logic: a logic netlist that decodes commands and selects inputs
- Truth tables based on signals from the controller
The RTL structural description is now available and it considers all elements.
Objective values could be retrieved from logic synthesis, but that is too slow.
Estimation model: perform fast estimations of objective values
- Area is difficult to estimate
- An existing area model* was updated and used
*: C. Brandolese, W. Fornaciari, and F. Salice, “An Area Estimation Methodology for FPGA Based Designs at SystemC-Level”, DAC '04: Proceedings of the 41st annual conference on Design automation, pp. 129–132.
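Port swapping can be illustrated with a small Python sketch that counts how many distinct sources feed each input port of one functional unit, since each port needs a multiplexer with that many inputs. The greedy rule and the register names are assumptions for illustration, not the algorithm used in the thesis.

```python
from typing import List, Tuple

# (left source, right source, is the operation commutative?)
Operation = Tuple[str, str, bool]

def mux_inputs(operations: List[Operation], allow_swap: bool) -> Tuple[int, int]:
    """Return the number of mux inputs needed on the left and right ports
    of a functional unit that executes all the given operations."""
    left, right = set(), set()
    for src_a, src_b, commutative in operations:
        # Total mux inputs if operands are kept as they are, or swapped.
        straight = len(left | {src_a}) + len(right | {src_b})
        swapped = len(left | {src_b}) + len(right | {src_a})
        if allow_swap and commutative and swapped < straight:
            src_a, src_b = src_b, src_a
        left.add(src_a)
        right.add(src_b)
    return len(left), len(right)

# Three additions bound to the same adder, operands read from registers.
adds = [("r1", "r2", True), ("r2", "r1", True), ("r1", "r3", True)]
without_swap = mux_inputs(adds, allow_swap=False)
with_swap = mux_inputs(adds, allow_swap=True)
```

Here swapping the operands of the second addition removes the left-port multiplexer entirely and shrinks the right-port one from three inputs to two.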

Design Space Exploration by Genetic Algorithm
Problem-dependent elements
Chromosome encoding
- Each operation in the specification has a gene that represents a feasible partial binding
- Genes are added to represent the algorithms used to perform the high-level synthesis steps: scheduling, register allocation and interconnection optimization
Fitness evaluation
- Information from the chromosome about partial binding and algorithms is used to perform a synthesis flow
- Objective values are estimated using the proposed model
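A possible shape for such a chromosome is sketched below in Python. The unit library, the algorithm names and the convention that every binding gene selects one compatible unit instance are assumptions made for illustration; the slides do not give the actual encoding details.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical resource library and algorithm choices.
UNITS: Dict[str, List[str]] = {"add": ["add_0", "add_1"], "mul": ["mul_0"]}
ALGORITHMS: Dict[str, List[str]] = {
    "scheduling": ["list", "ilp"],
    "register_allocation": ["coloring", "left_edge"],
    "interconnection": ["plain", "port_swap"],
}

def gene_domains(op_types: List[str]) -> List[int]:
    """Number of legal values per gene: binding genes first, algorithm genes last."""
    return ([len(UNITS[t]) for t in op_types]
            + [len(names) for names in ALGORITHMS.values()])

def random_chromosome(op_types: List[str], rng: random.Random) -> List[int]:
    """Each gene is drawn inside its own domain, so it is feasible by construction."""
    return [rng.randrange(n) for n in gene_domains(op_types)]

def decode(chrom: List[int], op_names: List[str],
           op_types: List[str]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Translate genes into a binding and a choice of synthesis algorithms."""
    n = len(op_names)
    binding = {op: UNITS[t][g] for op, t, g in zip(op_names, op_types, chrom[:n])}
    algos = {step: names[g]
             for (step, names), g in zip(ALGORITHMS.items(), chrom[n:])}
    return binding, algos

op_names, op_types = ["a", "b", "c"], ["add", "add", "mul"]
binding, algos = decode([1, 0, 0, 0, 1, 1], op_names, op_types)
```

The decoded binding and algorithm choices would then drive one run of the synthesis flow, whose estimated area and latency become the fitness of the individual.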

Design Space Exploration by Genetic Algorithm
Problem-independent elements
Generic operators
- Common operators (crossover and mutation) are used without modifications: no infeasible chromosomes can be created
- If the gene changed by the operators is related to:
  - an operation: a new binding constraint for that operation
  - an algorithm: a different algorithm to solve the related synthesis step
Initial population
- Created at random, or starting from some interesting points to explore around them
Solution ranking
- Solutions are ranked into different levels according to their fitness values
- Ranking is accelerated using the fast non-dominated sort algorithm available in NSGA-II
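The claim that standard operators cannot create infeasible chromosomes can be checked with a short Python sketch. It assumes an integer encoding in which each gene has its own domain size; the operator variants (one-point crossover, per-gene uniform mutation) and the domain list are illustrative choices, not necessarily those configured in Open BEAGLE for the thesis.

```python
import random
from typing import List, Tuple

def one_point_crossover(p1: List[int], p2: List[int],
                        rng: random.Random) -> Tuple[List[int], List[int]]:
    """Swap the tails of two parents; gene positions are preserved."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(chrom: List[int], domains: List[int],
           rng: random.Random, rate: float = 0.1) -> List[int]:
    """Redraw each gene inside its own domain with probability `rate`."""
    return [rng.randrange(d) if rng.random() < rate else g
            for g, d in zip(chrom, domains)]

def feasible(chrom: List[int], domains: List[int]) -> bool:
    """A chromosome is feasible if every gene lies inside its domain."""
    return (len(chrom) == len(domains)
            and all(0 <= g < d for g, d in zip(chrom, domains)))

# Hypothetical domains: three binding genes followed by three algorithm genes.
domains = [2, 2, 1, 2, 2, 2]
rng = random.Random(0)
parent_a = [rng.randrange(d) for d in domains]
parent_b = [rng.randrange(d) for d in domains]
child_a, child_b = one_point_crossover(parent_a, parent_b, rng)
mutant = mutate(child_a, domains, rng, rate=0.5)
```

Because crossover keeps every gene at its original position and mutation redraws a gene only within its own domain, feasibility is preserved without any repair step.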