Vivado HLS Update.

1 Vivado HLS Update

2 Vivado High-Level Synthesis: Accelerated IP Generation and Integration
C-based IP creation: C, C++ or SystemC in, VHDL or Verilog out.
C libraries: floating-point math.h, fixed point, OpenCV.
User-preferred system integration environment: System Generator for DSP, Vivado IP Integrator, Vivado IP Catalog, Vivado RTL integration.

Speaker notes (Slide 11: HLS libs and flow; time on foil: 2 mins)
Title: Vivado HLS: Accelerate C Libraries and System IP Integration
Today we support user C, C++ and SystemC to VHDL or Verilog generation.
We add more users by providing access to application-specific C libraries.
We have support for math.h (single- and double-precision floating point) today.
We have 31 basic OpenCV video functions going into production, and we will continue to add more functions. We are working on enabling our ecosystem and partners to provide OpenCV video functions for Xilinx programmable logic by leveraging HLS.
Next, we will also begin offering some DSP function libraries (filters, FFTs and DDS) in the second half of this year, and we will continue to drive market-specific libraries.
Vivado HLS creates IP that customers can use in their preferred Xilinx system IP integration environment:
  Vivado HLS creates IP with interfaces that users can use within IP Integrator to integrate IP systems.
  Vivado HLS creates IP that can be added as a block inside System Generator for the Vivado implementation flow. This block can be used seamlessly within the System Generator design environment, as it participates in data type and rate propagation and in HDL netlist generation.
  Vivado HLS packages IP as an IP-XACT based package that can be added to the Vivado IP Catalog. This IP can then be used in an RTL design or in Vivado IP Integrator.
Vivado HLS enables users to target 7 series FPGAs or Zynq SoCs (when available) in their preferred Xilinx design environment with the Vivado implementation flow.
Key takeaway: IP created by Vivado HLS, along with the C libraries, can be targeted to 7 series FPGAs or Zynq SoCs in the user's preferred Xilinx design environment (RTL, IP Integrator and System Generator).
In case the audience asks: both Vivado High-Level Synthesis and System Generator for DSP are available standalone or as part of the ISE® Design Suite DSP and System Editions and the Vivado Design Suite System Edition. Vivado HLS device and implementation flow support with the System Edition and standalone licenses is as follows:
  Vivado HLS supports 7 series and Zynq devices with an ISE Design Suite (IDS) DSP or System Edition license.
  Vivado HLS supports all devices supported by ISE and Vivado with a Vivado HLS standalone license.

3 Vivado HLS Video Libraries
C video libraries are available within the Vivado HLS header files:
  hls_video.h library
  hls_opencv.h library
They enable migration of OpenCV designs into Xilinx FPGAs.
The libraries target real-time Full HD video processing.
The libraries support standard AXI4 interfaces for easy system integration.

4 Video Library: 12 New Functions
Video data modeling: Linebuffer class, Window class.
AXI4-Stream I/O functions: AXIvideo2Mat, Mat2AXIvideo.
OpenCV interface functions:
  cvMat2AXIvideo, AXIvideo2cvMat, cvMat2hlsMat, hlsMat2cvMat
  IplImage2AXIvideo, AXIvideo2IplImage, IplImage2hlsMat, hlsMat2IplImage
  CvMat2AXIvideo, AXIvideo2CvMat, CvMat2hlsMat, hlsMat2CvMat
Video functions:
  AbsDiff, AddS, AddWeighted, And, Avg, AvgSdv, Cmp, CmpS, CornerHarris, CvtColor, Dilate,
  Duplicate, EqualizeHist, Erode, FASTX, Filter2D, GaussianBlur, Harris, HoughLines2, Integral, InitUndistortRectifyMap, Max,
  MaxS, Mean, Merge, Min, MinMaxLoc, MinS, Mul, Not, PaintMask, Range, Reduce,
  Remap, Resize, Scale, Set, Sobel, Split, SubRS, SubS, Sum, Threshold, Zero

5 C Test Bench: Interface Library
Interface libraries convert between OpenCV images and HLS types.
HLS Mat format: synthesizable, with AXI4-Stream support.
The test bench uses standard OpenCV files, formats and types together with the HLS video libraries.

    #include "hls_opencv.h"   // HLS video libraries (OpenCV interface functions)

    // Top-level C function (test bench)
    int main(int argc, char** argv) {
        IplImage* src = cvLoadImage(INPUT_IMAGE);
        IplImage* dst = cvCreateImage(cvGetSize(src), src->depth, src->nChannels);
        AXI_STREAM src_axi, dst_axi;

        // Convert to Xilinx AXI4 video stream
        IplImage2AXIvideo(src, src_axi);
        // Function to synthesize
        image_filter(src_axi, dst_axi, src->height, src->width);
        // Convert Xilinx AXI4 video stream back to OpenCV types
        AXIvideo2IplImage(dst_axi, dst);

        cvSaveImage(OUTPUT_IMAGE, dst);
        return 0;
    }

Speaker notes: The code here shows a top-level design, with two sub-functions to perform a sepia and then a Sobel filter operation. The entire image is passed in and out as arrays. An array is used to hold the data between the filter functions. [Animate x3] to explain details on how this is synthesized. Let's look at the filter functions in detail.

6 C Function to Synthesize
HLS video library functions are drop-in replacements for OpenCV functions and provide high QoR.

    #include "hls_video.h"      // HLS video library
    #include "ap_axi_sdata.h"   // AXI struct library

    // Top-level C function for synthesis
    void image_filter(AXI_STREAM& inter_pix, AXI_STREAM& out_pix, int rows, int cols) {
        // Create AXI streaming interfaces for the core
        RGB_IMAGE img_0(rows, cols);
        // ..etc..
        RGB_IMAGE img_5(rows, cols);
        RGB_PIXEL pix(50, 50, 50);

    #pragma HLS dataflow
        // Convert Xilinx AXI4 video stream to HLS Mat data type
        hls::AXIvideo2Mat(inter_pix, img_0);
        hls::Sobel(img_0, img_1, 1, 0);
        hls::SubS(img_1, pix, img_2);
        hls::Scale(img_2, img_3, 2, 0);
        hls::Erode(img_3, img_4);
        hls::Dilate(img_4, img_5);
        // Convert HLS Mat type to Xilinx AXI4 video stream
        hls::Mat2AXIvideo(img_5, out_pix);
    }

7 Application Note XAPP1167
Accelerating OpenCV Applications with Zynq using Vivado HLS Video Libraries.
The application note:
  Describes video processing data types
  Compares video architectures
  Explains the advantages of video streaming
  Reviews video interfaces
  Provides a reference design with source files and project directories
Download XAPP1167 from Xilinx.com.
QuickTake: Leveraging OpenCV and High-Level Synthesis with Vivado.

8 Accelerator AXI Interconnect
IP control from the ARM: AXI4-Lite through the GP port.
High-throughput access to memory: AXI4-Stream using AXI DMA, or AXI4 Master (the accelerator is the master).
Data transfer between HLS IP blocks: AXI4-Stream.
External memory access uses the HP port; L2 cache access uses the ACP port.
[Diagrams: Zynq PS GP port to HLS accelerator over AXI4-Lite; HLS accelerator over AXI4-Stream and AXI DMA to the Zynq PS HP or ACP port; HLS accelerator as AXI4 Master to the HP or ACP port.]

9 IP Integrator Supported
IP Integrator requires an Early Access license in Vivado.
HLS IP can be exported to IP Integrator: export to the Vivado IP Catalog (previously called IP-XACT format).
Data types supported: IPI can propagate.
Flow: Vivado HLS IP, export to the Vivado IP Catalog, add to the IP Catalog, then add the IP block and connect it up in Vivado IP Integrator (IPI).
Supported with two new tutorials.

10 HLS IP Integration
IP Integrator (IPI) public release: HLS output is fully supported in IPI.
Three tutorials cover using HLS IP inside IPI: two connect HLS IP to the Zynq PS, and one connects HLS IP with Xilinx IP.
HLS IP blocks are identified in IPI.
[Figure: HLS and System Generator IP shown inside IPI]

11 Improved Software Driver Support
Software drivers are created for AXI4-Lite interfaces.
  This now includes support for Linux systems.
  Drivers are also now created for the Vivado IP Catalog format.
Add all files to the software project: ifdef statements ensure automatic configuration.
The files are in the Drivers sub-directory.

12 Enhanced Report File Easier to find hot-spots
The term "throughput" has been changed to "Interval" or "Initiation Interval" in all reports and documentation.
The report shows:
  Latency and Interval for the top-level function
  Latency and Interval for all instances at this level of hierarchy
  All loops and sub-loops at this level of hierarchy
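
To make the Latency and Interval terms concrete, here is a minimal sketch of the usual cycle-count arithmetic for a loop. The function names and parameters are illustrative assumptions, not fields of a Vivado HLS report, and a real report may differ by a few cycles for loop entry and exit.

```cpp
#include <cstdint>

// Cycles for a loop whose iterations do not overlap (not pipelined):
// each iteration takes iteration_latency cycles.
std::uint64_t sequential_loop_cycles(std::uint64_t trip_count,
                                     std::uint64_t iteration_latency) {
    return trip_count * iteration_latency;
}

// Cycles for a pipelined loop: a new iteration starts every
// initiation_interval cycles, and the last iteration still needs
// iteration_latency cycles to finish.
std::uint64_t pipelined_loop_cycles(std::uint64_t trip_count,
                                    std::uint64_t initiation_interval,
                                    std::uint64_t iteration_latency) {
    if (trip_count == 0) return 0;
    return (trip_count - 1) * initiation_interval + iteration_latency;
}
```

With these assumptions, the Interval controls how total latency scales with the trip count, which is why the report lists Latency and Interval separately.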

13 Analysis Perspective A New Perspective for Design Analysis
Allows interactive analysis.
Module Hierarchy: hierarchical summary and navigation.
Performance View: scheduled operations.
  Loops, shown in yellow, are expandable and collapsible.
  Modules, shown in green, open the view on sub-blocks.
Performance Profile: Latency and Interval summary for this block.

14 Hierarchical Navigation Operations, loops and functions
Performance View: hierarchical navigation.
  Loop hierarchy.
  Select operations and right-click to cross-reference with the C source and HDL.
  The view shows operations, loops and functions against the scheduled states.

15 Resource Analysis
Resource View: scheduled operations associated with each resource. Anything on the same row shares the same resource.
Resource Profile: resource summary for this block.

16 Analysis Perspective Tutorials
Fully supported by two new tutorials: Design Analysis and Design Optimization.

17 Assertion Support
Assertions are supported for synthesis.
  They can be used to define bit-widths for synthesis.
  They replace the need for a Tripcount directive.

Without assertions:

    SUM_X: for (i = 0; i <= xlimit; i++) {
        X_accum += A[i];
        X[i] = X_accum;
    }
    SUM_Y: for (i = 0; i <= ylimit; i++) {
        Y_accum += B[i];
        Y[i] = Y_accum;
    }

With assertions:

    assert(xlimit < 32);
    SUM_X: for (i = 0; i <= xlimit; i++) {
        X_accum += A[i];
        X[i] = X_accum;
    }
    assert(ylimit < 16);
    SUM_Y: for (i = 0; i <= ylimit; i++) {
        Y_accum += B[i];
        Y[i] = Y_accum;
    }

Loop Latency report excerpts shown on the slide:

    * Loop Latency:
    |Target II |Trip Count |Pipelined |
    |- SUM_X |1 ~ |no |
    |- SUM_Y |1 ~ |no |

    Loop Latency:
    |Target II |Trip Count |Pipelined |
    |- SUM_X |1 ~ |no |
    |- SUM_Y |1 ~ |no |

Result: the index counter hardware is accurately sized.
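
As a minimal sketch of the assertion pattern above, the following plain C++ wraps the SUM_X loop in a function so it can be compiled and run in C simulation. The function name sum_x, the array size of 32 and the helper bits_for_max_value are assumptions added for illustration; the helper only shows the arithmetic of sizing a counter and is not what Vivado HLS runs internally.

```cpp
#include <cassert>

// Running sum over A[0..xlimit], in the style of the SUM_X loop.
// The assert states the bound the caller must respect; according to the
// slide, synthesis can use such a bound to size the loop counter.
void sum_x(const int A[32], int X[32], int xlimit) {
    assert(xlimit < 32);
    int X_accum = 0;
    SUM_X: for (int i = 0; i <= xlimit; i++) {
        X_accum += A[i];
        X[i] = X_accum;
    }
}

// Hypothetical helper: smallest number of bits that can hold the
// unsigned value max_value (at least 1 bit).
int bits_for_max_value(unsigned max_value) {
    int bits = 1;
    while (bits < 32 && (max_value >> bits) != 0) bits++;
    return bits;
}
```

For example, an index that never exceeds 31 fits in 5 bits, whereas a plain int index with no stated bound would be 32 bits wide.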

18 Improved Tutorials Vivado HLS is now provided with 10 Tutorials
The tutorials contain 22 labs, which cover all aspects of Vivado HLS.

Tutorial | Summary | Design
Introduction | Basic walkthrough of GUI operations (C sim, synthesis, RTL sim, IP packaging) | FIR
C Validation | C simulation and using the debugger | Filter Window
Interface Synthesis | Explains design, port and AXI interface synthesis (simple HLS design to allow analysis of I/O) | Sorter Design
Arbitrary Precision | Review of a floating-point and fixed-point windowing algorithm | Hamming Window
Design Analysis | Using the Analysis Perspective to optimize the performance of a multi-hierarchy, multi-loop design | DCT
Design Optimization with Pipelining | Improving performance using pipelining at loop and function level, and the impact of I/O | Matrix Multiplier
RTL Verification | Verify and view trace files using Vivado Xsim and ModelSim (incl. floating-point simulation) | DUC
Creating IP for an IP Integrator Design | Connecting to an IP core using IPI | Windower, FFT IP Core, Sorter
Creating IP for a Zynq Design | Connecting to Zynq with IPI and integrating driver files into an SDK design (interrupt handling etc.) | Accelerator
Creating IP for a System Generator Design | Packaging a design for Sys Gen and verifying I/O in Sys Gen (connecting interfaces etc.) | YUV

19 Improved AXI4 & SystemC Support
AXI4 Master, Stream and Lite protocols are now supported:
  Lite: use the RESOURCE directive to assign ports (as in C/C++).
  Stream: use the RESOURCE directive on sc_fifo_in and sc_fifo_out ports.
  Master: use the AXI4M_bus_port class.
The differences between SystemC and Vivado AP types are fully documented.
SystemC designs no longer need to be explicitly specified: the add_files -type option is retired (along with the C/C++ or SystemC check-box in the GUI).
AXI4 Master interface: now supported on array ports. Array ports can be synthesized with the ap_bus I/O protocol.

    AXI4M_bus_port<sc_fixed<32, 8> > bus_if;

20 RTL cosimulation of Floating Point Designs
The IEEE operators are now in the RTL simulation model. This requires that the Xilinx IEEE library is used when RTL co-simulation is performed.
Automatic support is provided (no action required) for:
  SystemC RTL
  Verilog and VHDL using the Xilinx Vivado (Xsim) simulator
  Verilog and VHDL using the Mentor Graphics ModelSim simulator
  Verilog and VHDL using the Xilinx Isim simulator
All other third-party HDL simulators: the libraries must be pre-compiled before simulating floating-point designs. Open Vivado and refer to: compile_simlib -help
Note: this is Vivado, not Vivado HLS.

21 DSP48 Adder Resource Adders supported for implementation in DSP48
Adders in the C code can be targeted to an AddSub_DSP RESOURCE. This ensures the adder or subtractor is implemented in a DSP48.
Resource specification: targets the adder or subtractor to a DSP48 resource.

    (* USE_DSP48 = "YES" *)
    module adders_add_32ns_32ns_32_1_AddSub_DSP_0 (a, b, s);
    endmodule

    module adders_add_32ns_32ns_32_1( … )
    adders_add_32ns_32ns_32_1_AddSub_DSP_0 U1 (
        .a( din0 ),
        .b( din1 ),
        .s( dout ));

22 DSP48 Adder Implementation
Adders/subtractors targeted to a DSP48. [Figure: Solution 1 and Solution 2]

23 FFT and FIR IP in HLS
The Xilinx FFT and FIR IP are available in Vivado HLS.
  C simulates with a bit-accurate model.
  Fully configurable within the C++ source code.
  Pre-defined C++ structs allow the IP to be configured and accessed.
Supported only for C++: implemented with templates.
High-quality implementation: the same hardware as implemented by the RTL versions of this IP.
Functionality is fully described in the Xilinx documentation:
  LogiCORE IP Fast Fourier Transform v9.0 (document PG109)
  LogiCORE IP FIR Compiler v7.1 (document PG149)

24 IP Examples Examples Included in Vivado HLS Release
Access the examples from the Welcome Screen, or from C:\Xilinx\Vivado_HLS\2013.3\examples\design (assuming the standard PC install path).
Example IP designs:
  1024-point FFT and inverse FFT (fixed point)
  Single 1024-point FFT (fixed point)
  FIR with 2 interleaved channels
  3 FIRs connected in series (HB, HB, SRRC)
  Updating coefficients using the FIR CONFIG channel
  SRRC (Square Root Raised Cosine) FIR filter

25 FFT Function: Using the FFT
Include the hls_fft.h library in the code.
  This defines the FFT and supporting structs and types.
  It allows hls::fft to be instantiated in your code.
Use the STATIC_PARAM template parameter to parameterize the FFT.
  The STATIC_PARAM template parameter defines all static configuration values.
  The library provides a pre-defined struct, hls::ip_fft::params_t, for this purpose.
  Optionally, modify the default parameters by creating a new user-defined STATIC_PARAM struct based on the default.

    #include "hls_fft.h"

    hls::fft<STATIC_PARAM> (            // Static parameterization struct
        INPUT_DATA_ARRAY,               // Input data, fixed or float
        OUTPUT_DATA_ARRAY,              // Output data, fixed or float
        OUTPUT_STATUS,                  // Output status
        INPUT_RUN_TIME_CONFIGURATION);  // Input run-time configuration
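
The idea of a user-defined STATIC_PARAM struct based on a default can be sketched in plain C++ without the Xilinx headers. Everything named here (default_params_t, my_params_t, transform_info and the two member names) is a hypothetical stand-in chosen for illustration; these are not the fields of hls::ip_fft::params_t.

```cpp
// Hypothetical stand-in for a library-provided default parameter struct.
struct default_params_t {
    static const unsigned log2_length = 10;  // 1024-point transform
    static const unsigned data_width  = 16;
};

// User-defined STATIC_PARAM struct: inherit the defaults and override
// only the value that should change.
struct my_params_t : default_params_t {
    static const unsigned data_width = 24;
};

// Hypothetical template that reads the struct at compile time, the way a
// templated IP wrapper would consume its static configuration.
template <typename STATIC_PARAM>
struct transform_info {
    static unsigned length()     { return 1u << STATIC_PARAM::log2_length; }
    static unsigned data_width() { return STATIC_PARAM::data_width; }
};
```

Because the configuration travels as a type, every value is fixed at compile time, which matches the slide's description of STATIC_PARAM as holding all static configuration values.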

26 FIR Function Using the FIR
Include the hls_fir.h library in the code.
  This defines the FIR and supporting structs and types.
  It allows hls::FIR to be instantiated in your code.
  Unlike the FFT, the FIR is instantiated as a class and executed with the run method.
Create the STATIC_PARAM template parameter to configure the FIR.
  The STATIC_PARAM template parameter defines all static configuration values.
  The library provides a pre-defined struct, hls::ip_fir::params_t, for this purpose.
  There are no default values for the coefficients: you must always create a user-defined struct based on hls::ip_fir::params_t.

    #include "hls_fir.h"

    // Create an instance of the FIR
    static hls::FIR<STATIC_PARAM> fir1;   // Static parameterization

    // Execute the FIR instance fir1
    fir1.run(INPUT_DATA_ARRAY,            // Input data
             OUTPUT_DATA_ARRAY);          // Output data
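
The class-plus-run usage style can be illustrated with a small software model. This is a sketch under stated assumptions: fir_model and my_fir_params are hypothetical names, the model is an ordinary direct-form FIR on int data, and it is not the hls::FIR implementation or its parameter struct.

```cpp
#include <cstddef>

// Hypothetical parameter struct. The user supplies the coefficients,
// mirroring the point that there are no default coefficient values.
struct my_fir_params {
    static const std::size_t num_taps = 3;
    static const int* coeffs() {
        static const int c[num_taps] = {1, 2, 1};
        return c;
    }
};

// Plain C++ model of a direct-form FIR that is instantiated as a class
// and executed with a run method on whole arrays.
template <typename STATIC_PARAM, std::size_t LENGTH>
class fir_model {
public:
    void run(const int (&in)[LENGTH], int (&out)[LENGTH]) const {
        for (std::size_t n = 0; n < LENGTH; ++n) {
            int acc = 0;
            // Samples before the start of the array are treated as zero.
            for (std::size_t k = 0; k < STATIC_PARAM::num_taps && k <= n; ++k) {
                acc += STATIC_PARAM::coeffs()[k] * in[n - k];
            }
            out[n] = acc;
        }
    }
};
```

Feeding a unit impulse through fir_model<my_fir_params, 4> returns the coefficients followed by zero, which is a quick sanity check for any FIR configuration.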

27 Using the FFT and FIR IP
FFT and FIR support pipelined implementations.
  The functions themselves cannot be pipelined.
  They should be parameterized for pipelined operation.
The data arguments are always arrays.
  These will be implemented as AXI4 Streams in the RTL.
  By default, arrays are implemented as BRAM interfaces.
Recommendation: use these IP in regions where dataflow optimization is used. This will auto-convert the input and output arrays into streaming arrays.
Alternatively, a requirement: the input and output arrays must be marked as streaming using the command set_directive_stream (pragma STREAM).

28 Fixed Point Math Functions
Further support for math functions.
The hls_math.h library now includes fixed-point functions for sin, cos and sqrt.
  The sin and cos functions are all 32-bit: ap_fixed<32,Int_Bit>, where Int_Bit specifies the number of integer bits.
  The sqrt function can be any width but must have a decimal point: it cannot be all integers or all bits.

Function | Type | Accuracy (ULP) | Implementation style
cos | ap_fixed<32,I> | 16 | Synthesized
sin | ap_fixed<32,I> | 16 | Synthesized
sqrt | ap_fixed<W,I>, ap_ufixed<W,I> | 1 | Synthesized

The accuracy in the table is quoted with respect to the equivalent floating-point version.
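
To show what an accuracy figure in ULP means for a fixed-point type, here is a minimal sketch in plain C++ using double arithmetic. It assumes the usual convention that a W-bit type with I integer bits has W - I fractional bits, so one ULP is 2^-(W - I). The function names are illustrative and are not part of hls_math.h.

```cpp
#include <cmath>

// Size of one unit in the last place (ULP) for a fixed-point format with
// W total bits and I integer bits: 2^-(W - I).
double fixed_point_ulp(int W, int I) {
    return std::ldexp(1.0, -(W - I));
}

// Error of approx relative to exact, measured in ULPs of that format.
double error_in_ulps(double approx, double exact, int W, int I) {
    return std::fabs(approx - exact) / fixed_point_ulp(W, I);
}
```

Under this reading, a 16 ULP bound on a 32-bit type with 2 integer bits corresponds to an absolute error of at most 16 * 2^-30.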

29 AXI4 Stream Interface: Ease of Use
Native support for AXI4 Stream interfaces.
  Native means an AXI4 Stream can be specified with set_directive_interface.
  It is no longer required to set the interface and then add a resource.
  This AXI4 Stream interface is part of the HDL after synthesis.
  This AXI4 Stream interface is simulated by RTL co-simulation.
Interface type "axis" is AXI4 Stream:

    set_directive_interface -mode axis "foo" portA

or

    #pragma HLS interface axis port=portA
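
For context, the pragma sits inside the body of the function whose port it describes. The sketch below places it in a small top-level function named foo, matching the directive above. The second port portB, the size N and the loop body are assumptions added to make the example complete. A standard C++ compiler ignores the HLS pragmas (possibly with a warning), so the function can still be run in C simulation.

```cpp
// Number of samples per call (an assumed value for this sketch).
const int N = 8;

// Top-level function with one input and one output array port, both
// marked as AXI4 Stream with the INTERFACE pragma.
void foo(int portA[N], int portB[N]) {
#pragma HLS interface axis port=portA
#pragma HLS interface axis port=portB
    for (int i = 0; i < N; i++) {
        // Each input element is read once and each output element is
        // written once, in order, as a streaming port expects.
        portB[i] = portA[i] + 1;
    }
}
```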

30 Pre-2013.3 Approach to AXI Streams
    #if 1
    // Use the new method
    #pragma HLS interface axis port=portA
    #else
    // Or use the old method
    #pragma HLS interface ap_fifo port=portA
    #pragma HLS resource core=AXI4Stream variable=portA \
        metadata="-bus_bundle Agroup"
    #endif

Existing functionality is deprecated BUT NOT REMOVED. We don't want to break existing designs.
Warning: if you use the older method for adding AXI4 Streams, where you set the interface as a FIFO and then add an AXI resource:
  You will get a FIFO interface in the RTL.
  The AXI4 Stream adapter is added during export_design.
Recommendation: change existing AXI4 Stream directives to use the INTERFACE directive.

31 AXI4 Master Interface: Pipeline Support
Transactions involving an AXI4 Master interface are now pipelined.
Previously, this interface would not pipeline:
  Each transfer was an "atomic" process.
  The for-loop or memcpy waited until a transfer completed before starting the next transfer.
  This was the limiting factor in the pipeline interval.
Improved performance: accesses to an AXI master interface can now be pipelined, and the performance will be much better than before.
Further improvements are planned. Existing limitations: the base address cannot be configured, bursts cannot be inferred, and reads and writes cannot be performed simultaneously (sequential only).
We expect to get more performance in a later release. At that time we'll publish statistics and make more noise about this feature.

32 Enhanced Support for Exporting IP
Sys Gen and AXI Stream interfaces:
  Designs with AXI Stream interfaces can now be exported to System Generator.
  The AXI interfaces will be present and can be connected.
  Previously, AXI interfaces were not supported in Sys Gen.
AXI Lite drivers:
  Software drivers are now included in the IP package.
  When creating a local repository in SDK, simply point to the IP package.
  There is no need to manually copy files.
Further ease-of-use (EoU) enhancements are coming.

33 New Clang Front-end
Vivado HLS has upgraded its front-end parser.
  It now uses clang instead of gcc.
  This provides 64-bit support on Windows.
  In addition, this enables continued growth of features and functionality: more optimizations are possible, messages can reference line and column, etc.
Clang side-effect: different command options.
  The new front-end does not support all gcc flags. For example, -fpermissive is now ignored, as it is not supported by clang.
  If an option is not supported but is provided, it will be ignored.
  Clang options:
Clang side-effect: stricter syntax checking.
  Some existing working designs may fail. This is not expected to occur often, but it is possible.
  Example -fpermissive workaround: for memcpy(dest, src), if src is a volatile pointer, cast it to a constant pointer to pass syntax checking.
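
The memcpy workaround can be sketched as follows. The function name copy_block and its parameters are assumptions for illustration. The cast drops the volatile qualifier for the copy, so this is only appropriate when the underlying buffer does not actually need volatile semantics during the transfer.

```cpp
#include <cstring>

// src is a pointer to volatile data. memcpy takes a const void* source,
// and C++ has no implicit conversion from volatile int* to const void*,
// so passing src directly is rejected by a strict front end.
void copy_block(int* dest, volatile int* src, std::size_t count) {
    // Workaround from the slide: cast the volatile pointer to a constant
    // pointer before handing it to memcpy.
    std::memcpy(dest, const_cast<const int*>(src), count * sizeof(int));
}
```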

34 Design Hubs: Easier Access to Documentation
DocNav Design Hubs: improved ease of use.
  Find things faster.
  Open docs at the exact page.
High-Level Synthesis hub: Getting Started, Videos, Tutorials, Key Concepts, FAQs.
  These and the solution center will be updated in the coming weeks.
  Others, such as "Designing with Video", will be added.
  Ideas for topics are welcome.
Standard introduction docs and videos, and app notes and videos, are all grouped.

35 Thank You

