Workshop - November 2011 - Toulouse Toulouse, J.LACHAIZE (Astrium) High Level Synthesis.


1 Workshop - November 2011 - Toulouse Toulouse, J.LACHAIZE (Astrium) High Level Synthesis

2 Application to industrial case studies: Astrium
[Diagram: SoC design flow. Recoverable labels: Global SoC spec.; SoC Architecture; Functional validation; SW Performance validation; C/C++/ASM; Functionality; Functionality + timing; Instruction Set Simulator; System requirements; Platform assembly; Metrics; HLS; System Properties; HW Properties; SW Properties; TLM LT; TLM AT; Software; Co-simulation/Co-emulation; Silicon; Device execution; Traffic generators; IP-Xact; SoC Header generation; RTL; Requirement traceability]

3 Overview
- Algorithm
- Process
- C translation
- C adaptation to GAUT
- First iteration
- Simulation

4 Algorithm

5 Process
- Reference model: MATLAB code
- Manual transformation of the MATLAB code to C code
- Validation based on 3 reference cases
- Output comparison (bit-accurate objective)
- Identification of the C functions that can yield better performance in HW
- Synthesis of the C code (GAUT)
- Testbench generation (GAUT)
- Test of the generated code (Modelsim®)
- Iteration on IO control

6 Algorithm -> HLS Validation
- Intermediate results required
- Validity criteria (computation precision)
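
The output comparison against the reference cases can be sketched as follows. This is a minimal illustration, not the project's actual test harness: the function name `count_mismatches` and the integer sample type are assumptions. A tolerance of 0 corresponds to the bit-accurate objective; a larger tolerance models a looser computation-precision criterion.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: counts the samples where the C output differs from
 * the reference model by more than `tol`. tol = 0 means bit-accurate. */
size_t count_mismatches(const int *ref, const int *out, size_t n, int tol)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        /* Widen before subtracting so the difference cannot overflow. */
        long long diff = (long long)ref[i] - (long long)out[i];
        if (llabs(diff) > (long long)tol)
            bad++;
    }
    return bad;
}
```

The same check can be applied to intermediate results, which helps locate the stage where precision is lost.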

7 HLS for architecture exploration
- Metrics (area, performance): 20% pessimistic, but usable for trade-offs
- Helps the migration to bit-accurate arithmetic (ac_type/sc_type)
- HLS requires considering any bottleneck in the IO architecture
- HLS incremental refinement is a try/test loop: heuristic approach
- Allows measuring the latency introduced by pipelining
- Separation of the processing and the IO constraints
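
The latency introduced by pipelining can be estimated with the usual textbook model, sketched below. This is not a GAUT report format: the function name and the parameters (`depth` in cycles, `ii` as initiation interval) are assumptions for illustration.

```c
/* Textbook pipeline model (assumption, not taken from GAUT):
 * the first result appears after `depth` cycles, then one result
 * every `ii` cycles (initiation interval). */
unsigned long pipelined_cycles(unsigned long n_items,
                               unsigned long depth,
                               unsigned long ii)
{
    if (n_items == 0)
        return 0;                       /* nothing to process */
    return depth + (n_items - 1) * ii;  /* fill latency + steady state */
}
```

For example, 100 items through a 5-stage pipeline with an initiation interval of 1 take 104 cycles, so the fill latency matters mostly for short bursts.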

8 High Level Synthesis: GAUT
- Academic tool
- Public domain (CECILL-B License): open source and free
- Dedicated to DSP applications (data-dominated algorithms)
- Inputs:
  - Algorithm written in bit-accurate C/C++ (bit-accurate integer and fixed-point types from Mentor Graphics)
  - Synthesis constraints (average data throughput, clock, I/O constraints…)
- Outputs:
  - RTL architecture written in VHDL (IEEE 1076)
  - Simulation model in SystemC
  - Automated test-bench generation
[Diagram: the SoC design flow of slide 2, repeated]

9 Example: Static detection
- Conversion of the RGB pixel (i,j) into a pseudo-chromatic value.
- Generation of a bit mask giving the validity of pixel (i,j): the pixel is valid if the value lies in a pre-defined range.
Pseudo-code:
  Val = Pix.R + Pix.G + Pix.B
  If LowThreshold <= Val <= HighThreshold then Mask = 1 else Mask = 0
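
The pseudo-code for a single pixel can be written as a small self-contained C function. This is a sketch: the `pixel_t` layout (16-bit channels) and the function name `static_mask` are assumptions, only the field names R, G, B and the comparison come from the slide.

```c
#include <stdint.h>

/* Hypothetical pixel layout; the real T_IMAGE type is not shown here. */
typedef struct {
    uint16_t R, G, B;
} pixel_t;

/* Returns 1 when the pseudo-chromatic value R+G+B lies in [low, high],
 * 0 otherwise. */
uint8_t static_mask(pixel_t pix, int low, int high)
{
    int val = pix.R + pix.G + pix.B;
    return (val >= low && val <= high) ? 1 : 0;
}
```

Both thresholds are inclusive, as in the pseudo-code.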

10 Original C Code
#include "socket.h"
#include <stdio.h> /* for printf */
/* ligne = row, colonne = column */
void staticDetection (T_PARAMS *params, T_IMAGE image, T_MASK mask)
{
    int ligne, colonne ;
    int tmp ;
#if DEBUG_STATICDETECTION
    printf("--> StaticDetection \n") ;
#endif
    for (colonne=0; colonne<params->imSizeC; colonne++) {
        for (ligne=0; ligne<params->imSizeL; ligne++) {
            /* pseudo-chromatic value of the pixel */
            tmp = MAT(image,ligne,colonne).R + MAT(image,ligne,colonne).G + MAT(image,ligne,colonne).B ;
            if ((tmp >= params->lowThres) && (tmp <= params->highThres)) {
                MAT(mask,ligne,colonne) = (unsigned char)1 ;
            } else {
                MAT(mask,ligne,colonne) = 0 ;
#if DEBUG_STATICDETECTION
                printf("StaticDetection Invalid pixel (%d,%d) \n",ligne,colonne) ;
#endif
            }
        }
    }
}

11 C transformation
- GAUT needs a "main" function
- Inputs are transmitted by value
- Outputs are transmitted by address (*)
- Remove non-synthesisable code (e.g. printf)
- Complete path for include files (absolute or relative to the C file)
- Corrected with the last GAUT release => use of pragma

12 GAUT C Code
#include "../include/socket.h"
#ifndef GAUT
#include <stdio.h>
#endif
#ifdef GAUT
/* thresholds become constants for synthesis */
static const int hiThres = 1022 ;
static const int lowThres = 2 ;
#endif
#ifdef GAUT
int main (T_PARAMS *params, T_IMAGE image, T_MASK mask)
#else
void staticDetection (T_PARAMS *params, T_IMAGE image, T_MASK mask)
#endif
{
    int ligne, colonne ;
    int tmp ;
    ….
}

13 Graph Generation

14 Code impact

15 VHDL simulation

16 Synthesis improvement
- IO constraints select the bus used by each data item, which allows the transfers to be serialized or parallelised
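
The effect of the bus selection on transfer time can be approximated with a simple count, sketched below. It assumes one word per bus per cycle and evenly distributed data, which is a simplification and not GAUT's actual scheduling; the function name is hypothetical.

```c
/* Simplified model (assumption): each bus carries one word per cycle.
 * Returns the number of cycles needed to move `words` data words,
 * or 0 when `buses` is 0, which is treated as an invalid configuration. */
unsigned transfer_cycles(unsigned words, unsigned buses)
{
    if (buses == 0)
        return 0;
    return (words + buses - 1) / buses;   /* ceiling division */
}
```

With the three colour channels of the static detection example, a single data bus needs 3 cycles per pixel, while three parallel buses need 1.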

17 Single Data bus

18 Parallel data buses

19 GAUT REX (lessons learned) 1/2
- Mainly limited to pipelinable algorithms (which restricts its usage in the industrial world)
- Strong effort to write synthesisable C code
- Strong effort to write the constraint file
- Some limitations in the HDL generation
  - Missing functions (division)
  - Not critical and partially corrected
  - Generation of non-synthesisable VHDL (for edge cases)
  - Multiple outputs are not synchronous

20 GAUT REX 2/2
- Powerful academic tool for Data Flow Graph
- Good support of Xilinx targets
- ATC18RHA targeting: on-going study
- Generated component directly pluggable on a bus (not used)
- Generated HDL efficiency (# gates and speed)
- Further evolution in the frame of project P
  - CDFG support
  - IO communication pattern instantiation

21 HLS usage
- Fast algorithm exploration (eases iteration)
- Estimation of area and achievable speed
- HW/SW partitioning trade-offs
- Improves early confidence in the design
- Delays the design freeze
- Data exchange characterisation
- Helps identify the factorisable operators (use of the intermediate representation)

22 HLS highlights
- Eases IP maintenance/evolution.
- Requires both hardware competence and software skills.
- It is quite natural to transform Matlab to C and then to RTL, but it is a trap: the C implementation is optimized for SW, not for HW, and the SW optimizations can be counterproductive.
- Not optimal for data handling (FIFO, cache, prefetch)
- For the manager: no gain in the development process itself, but a gain in the exploration process, which avoids some dead ends

23 GAUT expected enhancements
- Pipelining needs to be handled externally
  - Valid signal after the pipeline fill
  - Synchronization of multiple outputs on the same edge
- Control of loop unrolling
  - Automatic loop unrolling under constraint: no manual override
  - Selection of the loop to be unrolled
- Output timing constraint propagation
- Add traceability between the C code and the generated VHDL code
  - Map C variables to the HDL datapath
- Operator set extension (e.g. fixed-point division)
- IO management (optimal data organization,...)
- Would require some additional work for tool qualification (documentation, validation of generated HDL, IO interface configuration/control,...)
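
The externally handled valid signal can be modelled in software as a shift register of valid flags, one per pipeline stage. This is a behavioural sketch only: `PIPE_DEPTH`, `valid_pipe_t` and `valid_pipe_step` are hypothetical names and the depth of 3 is arbitrary.

```c
#define PIPE_DEPTH 3   /* hypothetical pipeline depth, in stages */

/* One valid flag per pipeline stage; zero-initialise before use. */
typedef struct {
    unsigned char valid[PIPE_DEPTH];
} valid_pipe_t;

/* Models one clock edge. Returns the valid flag that was held in the
 * last stage before the edge, then shifts every flag one stage forward
 * and loads `in_valid` into the first stage. */
int valid_pipe_step(valid_pipe_t *p, int in_valid)
{
    int out = p->valid[PIPE_DEPTH - 1];
    for (int i = PIPE_DEPTH - 1; i > 0; i--)
        p->valid[i] = p->valid[i - 1];
    p->valid[0] = in_valid ? 1 : 0;
    return out;
}
```

With a continuous input stream, the returned flag stays at 0 for the first `PIPE_DEPTH` calls and becomes 1 afterwards, which is the "valid after pipeline fill" behaviour the slide asks for.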

24 Perspectives
- ASTRIUM is convinced of the interest of HLS
- The quality of the tools depends mainly on the basic library
- Control Data Flow support is mandatory: the majority of algorithms embed multiple phases.
- GAUT is accessible for preliminary studies and its performance is comparable with that of some commercial tools.
- A strong cooperation with the Lab-STICC is foreseen to improve the tool in an on-going project.

25 Thank you for your attention. Any questions?

