DSP for FPGA SYSC5603 (ELG6163) Digital Signal Processing Microprocessors, Software and Applications Miodrag Bolic.

1 DSP for FPGA SYSC5603 (ELG6163) Digital Signal Processing Microprocessors, Software and Applications Miodrag Bolic

2 Objectives
Comparison between PDSP and FPGA
Virtex II Pro
Altera Stratix FPGA
Stratix DSP Block and its configuration
Altera design flow

3 What Is an FPGA?
Field Programmable Gate Array
Device that Has a Regular Architecture (Set of Blocks) that Can Be Programmed for Various Functions
– “Glue” Logic
– Customizable Hardware Solution
– Configurable Processors

4 Why Use FPGAs in DSP Applications?
10x More DSP Throughput Than DSP Processors
– Parallel vs. Serial Architecture
Cost-Effective for Multi-Channel Applications
Flexible Hardware Implementation
Single-Chip Solution
– System (Hardware/Software) Integration Benefits
[Diagram labels: FPGA, Software, Embedded Processor, DSP, System]

5 DSP Processors vs. FPGAs
High Speed DSP Processor
– 1-8 Multipliers
– Needs looping for more than 8 multiplications
– Needs multiple clock cycles because of serial computation
– 200 Tap FIR Filter would need 25+ clock cycles per sample with an 8 MAC unit processor
High Level of Parallel Processing in FPGA
– Can implement hundreds of MAC functions in an FPGA
– Parallel implementation allows for faster throughput
– 200 Tap FIR Filter would need 1 clock cycle per sample
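The cycle counts quoted on this slide follow from dividing the number of filter taps by the number of MAC units that can work in parallel. A minimal Python sketch of that arithmetic is given here; the function name `cycles_per_sample` is an illustrative assumption, and loop and memory-access overhead are ignored (which is why the slide says "25+" rather than 25).

```python
import math

def cycles_per_sample(taps: int, mac_units: int) -> int:
    """Lower bound on clock cycles needed to produce one FIR output sample."""
    if taps <= 0 or mac_units <= 0:
        raise ValueError("taps and mac_units must be positive")
    # Each cycle, every MAC unit can process one tap.
    return math.ceil(taps / mac_units)

# 200-tap filter on a processor with 8 MAC units, then fully parallel in an FPGA.
print(cycles_per_sample(200, 8))
print(cycles_per_sample(200, 200))
```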

6 Extending Range of Altera Reconfigurable DSP Solutions
[Chart: Performance (MMACs/sec) with axis marks at 100 and 600; labels: Embedded Processors, Hardware Acceleration, Complete Hardware Implementation, "New!"]

7 Comparison of DSP Devices
Programmable DSP Processors
Benefits:
– Easy to Use
– Programmed via C-Code or Assembly
– Fast Development Time
Weaknesses:
– Fixed Architecture
– Inefficient for Highly Recursive Algorithms Unless Hardware Accelerated
– Potential Bus Bottlenecks
– Other Devices (FPGAs) Often Used on Board for Other Functions
Reconfigurable DSP
Benefits:
– Easy to Use
– Programmed via C-Code, Assembly, or HDL
– Efficient for Recursive Algorithms Using DSP IP Cores
– Higher Levels of Integration
Weaknesses:
– Longer Development Time (But Getting Shorter!)

8 Objectives
Comparison between PDSP and FPGA
Virtex II Pro
Altera Stratix FPGA
Stratix DSP Block and its configuration
Altera design flow

9 Stratix EP1S10 [2]

12 TriMatrix™ Memory [1]
M512 Blocks: 512 bits per block + parity
– Small FIFOs
– Shift Register
– Rake Receiver Correlator
– FIR Filter Delay Line
M4K Blocks: 4 Kbits per block + parity
– Header / Cell Storage
– Channelized Functions
– ATM cell–packet processing
– Nios Program Memory
M-RAM: 512 Kbits per block + parity
– Packet / Data Storage
– Nios Program Memory
– System Cache
– Video Frame Buffers
– Echo Canceller Data Storage
Other labels on the slide: Dedicated External Memory Interface; Look-Up Schemes; Packet & Cell Buffering; Cache; More Bits For Larger Memory Buffering; More Data Ports for Greater Memory Bandwidth

13 Memory Bandwidth Summary, Stratix Device Family [1]
Device   Total RAM Bits   M-RAM Blocks   M4K Blocks   M512 Blocks   Maximum Bandwidth (Mbps)
EP1S10   920,448          1              60           94            1,245,024
EP1S20   1,669,248        2              82           194           2,096,928
EP1S25   1,944,576        2              138          224           2,894,400
EP1S30   3,317,184        4              171          295           3,750,192
EP1S40   3,423,744        4              183          384           4,384,800
EP1S60   5,215,104        6              292          574           6,762,528
EP1S80   7,427,520        9              364          767           8,784,720
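The Total RAM Bits column can be cross-checked against the block counts. The sketch below assumes that "+ parity" on the TriMatrix slide means one parity bit per 8 data bits (576, 4,608 and 589,824 bits per M512, M4K and M-RAM block respectively); that ratio is an assumption of this example, not something the slides state, but it reproduces every total in the table. The names `BITS_PER_BLOCK`, `DEVICES` and `total_ram_bits` are illustrative.

```python
# Bits per block including parity, ASSUMING one parity bit per 8 data bits.
BITS_PER_BLOCK = {
    "M-RAM": 512 * 1024 * 9 // 8,  # 589,824
    "M4K": 4 * 1024 * 9 // 8,      # 4,608
    "M512": 512 * 9 // 8,          # 576
}

# (M-RAM blocks, M4K blocks, M512 blocks, listed total RAM bits) per device.
DEVICES = {
    "EP1S10": (1, 60, 94, 920_448),
    "EP1S20": (2, 82, 194, 1_669_248),
    "EP1S25": (2, 138, 224, 1_944_576),
    "EP1S30": (4, 171, 295, 3_317_184),
    "EP1S40": (4, 183, 384, 3_423_744),
    "EP1S60": (6, 292, 574, 5_215_104),
    "EP1S80": (9, 364, 767, 7_427_520),
}

def total_ram_bits(mram: int, m4k: int, m512: int) -> int:
    """Sum the capacity of all three block types, parity bits included."""
    return (mram * BITS_PER_BLOCK["M-RAM"]
            + m4k * BITS_PER_BLOCK["M4K"]
            + m512 * BITS_PER_BLOCK["M512"])

for name, (mram, m4k, m512, listed) in DEVICES.items():
    computed = total_ram_bits(mram, m4k, m512)
    print(name, computed, "matches" if computed == listed else "differs")
```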

14 Logic Element (LE) [2]
[Functional diagram: 4-input LUT feeding a register with sync load & clear logic]
Inputs: data1, data2, data3, data4, cin, addnsub, register control signals, LUT chain input, register chain input
Outputs: row, column & DirectLink routing; local routing; LUT chain output; register chain output
Other paths: register feedback
Notes:
1) Functional diagram only. Please see datasheet for more details.
2) addnsub & data1 connected via XOR logic

15 Dynamic Arithmetic Mode
[Functional diagram: sum calculator and carry calculator feeding a register with sync load & clear logic]
Inputs: data1, data2, data3, addnsub, LAB carry-in, carry-in0, carry-in1, register control signals, register chain input
Carry logic: carry-in logic, carry-out logic, carry-out0, carry-out1
Outputs: row, column & DirectLink routing; local routing; register chain output
Note: Functional diagram only. Please see datasheet for more details.
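The addnsub control lets the same adder chain perform either addition or subtraction, and the preceding slide notes that it is combined with data1 through XOR logic. A minimal sketch of the usual two's-complement trick follows: invert one operand with XOR and inject a carry-in of 1. The function name, the choice of which operand is inverted, the `subtract` flag polarity and the 10-bit default width are all assumptions made for illustration; the datasheet defines the actual hardware behavior.

```python
def add_sub(a: int, b: int, subtract: bool, width: int = 10) -> int:
    """Add or subtract two `width`-bit values using a single adder.

    Subtraction is a + (b XOR all-ones) + 1, i.e. a plus the two's
    complement of b. The result wraps to `width` bits.
    """
    mask = (1 << width) - 1
    invert = mask if subtract else 0   # XOR with all-ones inverts every bit
    carry_in = 1 if subtract else 0    # the "+1" of two's complement
    return ((a & mask) + ((b & mask) ^ invert) + carry_in) & mask

print(add_sub(5, 3, subtract=False))
print(add_sub(5, 3, subtract=True))
```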

16 Logic Array Blocks (LAB) [2]
[Diagram: LE1 through LE10, each with a 4-wide connection to the local interconnect]
10 LEs
Local Interconnect
LAB-Wide Control Signals
30 LAB Input Lines
10 LE Feedback Lines

17 Avalon Switch Fabric Contents
Avalon Switch Fabric provides the following to the peripherals it connects:
– Data-Path Multiplexing
– Address Decoding
– Wait-State Generation
– Dynamic Bus Sizing
– Interrupt-Priority Assignment
– Latent Transfer Capabilities
– Streaming Read and Write Capabilities
Avalon Switch Fabric tailors transactions to the characteristics of the attached peripherals

18 SOPC Design Example
[Block diagram: masters and slaves connected through the Avalon Switch Fabric]
Masters:
– CPU (32 Bit): Inst Master, Data Master
– DMA Controller With Streaming: Read Port (Master – Streaming), Write Port (Master – Streaming)
Slaves:
– DMA Controller Control Port (Slave)
– UART
– Instruction Memory, 32-bit Data Path
– Data Memory, 32-bit Data Path
– VGA Controller
– Avalon Tri-State Bridge
– External FLASH, 1 MB, 16-bit Data Path
– External SRAM, 256 KB, 32-bit Data Path
Allows masters and slaves to communicate without knowledge of each other's interface details

19 Data Path Multiplexing & Slave Arbitration
[Block diagram: same system as the SOPC design example, with an Arbiter and a MUX shown inside the Avalon Switch Fabric]
1) Data-Path Multiplexing
2) Slave Arbitration
3) Address Decoding
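To make address decoding and dynamic bus sizing concrete, here is a small behavioral sketch in Python. It is not Avalon code: the slave sizes and data-path widths are taken from the design example, but the base addresses, the names `Slave`, `SLAVES`, `decode` and `slave_transfers`, and the simple rule that a wide master access to a narrow slave is split into several slave-side transfers are assumptions made for illustration.

```python
from typing import NamedTuple

class Slave(NamedTuple):
    name: str
    base: int        # hypothetical base address
    size: int        # address span in bytes
    width_bits: int  # native data-path width

# Sizes and widths come from the slides; base addresses are made up.
SLAVES = [
    Slave("ext_flash", 0x0000_0000, 1 * 1024 * 1024, 16),
    Slave("ext_sram", 0x0010_0000, 256 * 1024, 32),
]

def decode(address: int) -> Slave:
    """Address decoding: pick the slave whose range contains the address."""
    for slave in SLAVES:
        if slave.base <= address < slave.base + slave.size:
            return slave
    raise ValueError(f"no slave mapped at address {address:#x}")

def slave_transfers(address: int, master_width_bits: int = 32) -> int:
    """Dynamic bus sizing: slave-side transfers for one master-width access."""
    slave = decode(address)
    return max(1, master_width_bits // slave.width_bits)

print(decode(0x0000_0100).name, slave_transfers(0x0000_0100))
print(decode(0x0010_0000).name, slave_transfers(0x0010_0000))
```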

20 Objectives
Comparison between PDSP and FPGA
Virtex II Pro
Altera Stratix FPGA
Stratix DSP Block and its configuration
Altera design flow

21 DSP Blocks
Each DSP block can be configured as one of:
– Eight 9 × 9 bit multipliers
– Four 18 × 18 bit multipliers
– One 36 × 36 bit multiplier
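The 4:1 ratio between the 18 × 18 and 36 × 36 configurations is consistent with building one wide product from four narrower partial products. The sketch below shows that schoolbook decomposition for unsigned operands. It is an arithmetic illustration under that assumption, not a description of the block's internal wiring, and the names `mul18` and `mul36` are hypothetical.

```python
MASK18 = (1 << 18) - 1

def mul18(a: int, b: int) -> int:
    """Stand-in for one unsigned 18 x 18 multiplier (36-bit result)."""
    assert 0 <= a <= MASK18 and 0 <= b <= MASK18
    return a * b

def mul36(a: int, b: int) -> int:
    """Unsigned 36 x 36 multiply built from four 18 x 18 partial products."""
    a_hi, a_lo = a >> 18, a & MASK18
    b_hi, b_lo = b >> 18, b & MASK18
    return ((mul18(a_hi, b_hi) << 36)
            + ((mul18(a_hi, b_lo) + mul18(a_lo, b_hi)) << 18)
            + mul18(a_lo, b_lo))

a, b = (1 << 36) - 1, 0x9_8765_4321
print(mul36(a, b) == a * b)
```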

22 DSP Blocks (cont.)
The DSP block consists of:
– A multiplier block
– An adder/subtractor/accumulator block
– A summation block
– An output interface
– Output registers
– Routing and control signals

23 Stratix DSP Blocks
High Performance Dedicated Multiplier Circuitry
– 18x18 Functions at 280 MHz
Variable Operand Widths with Full Precision Outputs
– 9x9 (8 Max.)
– 18x18 (4 Max.)
– 36x36 (1 Max.)
Add, Accumulate or Subtract
– Signed & Unsigned Operations
– Dynamically Change between Add & Subtract
– Supports DSP Requirements Including Complex Numbers
[Block diagram: Input Register Unit, multipliers, Optional Pipelining, add/subtract stage, Output Multiplexer, Output Register Unit]
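As an illustration of why add and subtract support matters for complex numbers: one complex multiplication needs four real multiplications, one subtraction (real part) and one addition (imaginary part). The sketch below shows only the arithmetic; the function name is an assumption, and mapping the four products onto the four 18x18 multipliers of one block is a plausible reading of the slide rather than something it states.

```python
def complex_multiply(a_re: int, a_im: int, b_re: int, b_im: int) -> tuple[int, int]:
    """(a_re + j*a_im) * (b_re + j*b_im) using four real multiplies."""
    p0 = a_re * b_re
    p1 = a_im * b_im
    p2 = a_re * b_im
    p3 = a_im * b_re
    real = p0 - p1   # subtract stage
    imag = p2 + p3   # add stage
    return real, imag

print(complex_multiply(1, 2, 3, 4))
```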

24 DSP Block for 18 x 18-bit Mode

25 Shift Register Chain

26 Adder/Output Block

27 Time-Domain Multiplexed FIR Filters

28 Operation of TDM Filter
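The slide content for the TDM filter did not survive in this transcript, so the following is only a generic behavioral sketch of time-domain multiplexing, not the author's diagram. A small number of multipliers runs on a faster clock and is reused across all taps, trading clock cycles for hardware. The names `fir_direct` and `fir_tdm` and the `multipliers` parameter are illustrative assumptions.

```python
def fir_direct(samples, coeffs):
    """Reference FIR: every tap is computed in the same sample period."""
    history = [0] * len(coeffs)
    out = []
    for x in samples:
        history = [x] + history[:-1]          # shift-register delay line
        out.append(sum(c * h for c, h in zip(coeffs, history)))
    return out

def fir_tdm(samples, coeffs, multipliers=1):
    """Same filter with shared multipliers; also counts fast-clock cycles."""
    history = [0] * len(coeffs)
    out, cycles = [], 0
    for x in samples:
        history = [x] + history[:-1]
        acc = 0
        for start in range(0, len(coeffs), multipliers):
            # One fast-clock cycle: each multiplier handles one tap.
            for k in range(start, min(start + multipliers, len(coeffs))):
                acc += coeffs[k] * history[k]
            cycles += 1
        out.append(acc)
    return out, cycles

samples, coeffs = [1, 2, 3, 4], [1, 2, 3, 4]
tdm_out, tdm_cycles = fir_tdm(samples, coeffs, multipliers=2)
print(tdm_out == fir_direct(samples, coeffs), tdm_cycles)
```

With 4 taps and 2 shared multipliers, each output takes 2 fast-clock cycles, so the multipliers must run at twice the sample rate.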

30 Resource Savings with DSP Blocks
DSP Block
– Reduces LE Usage
– Reduces Routing Congestion
– Reduces Power
– Maintains Performance
Saves 652 routing nets!
90% of your problems are hidden under the surface!
[Diagram: multipliers and adders with signal widths labeled 18, 36, and 38]

31 Design Flow

32 Design Flow Overview
1) Create Design in Simulink Using Altera Libraries
2) Simulate in Simulink
3) Add SignalCompiler to Model
4) Create HDL Code & Generate Testbench
5) Perform RTL Simulation
6) Synthesize HDL Code & Place & Route
7) Program Device
8) SignalTap II Logic Analyzer

33 Step 1 - Create Design in Simulink Using Altera Libraries
Drag & Drop Library Blocks into Simulink Design
Parameterize Each Block

34 Parameterization of IP Megacores

35 Step 2 - Simulate in Simulink

36 Step 3 - Add “Signal Compiler” to Model to Generate HDL Code
[Screenshot of the Signal Compiler dialog]
Device families: APEX20K/E/C, APEX II, Stratix & Stratix GX, Cyclone & ACEX 1K, Mercury, FLEX10K & FLEX 6000, DSP Boards
Synthesis tools: Leonardo Spectrum, Synplify, Quartus II
Optimization: Speed vs. Area
Testbench Generation
Message Window

37 Step 4 - Create HDL Code & Generate Testbench
AltrFir32.mdl
AltrFir32.vhd
Enable "Generate Stimuli for VHDL Testbench" Button

38 HDL Code Generation

39 DSP Builder Report File
Lists All Converted Blocks
– Port Widths
– Sampling Frequencies
– Warnings & Messages

40 Step 5 - Perform RTL Simulation (ModelSim)
1) Set working directory (File => Change Directory)
2) Run TCL file (Tools => Execute Macro)

41 Perform Verification ModelSim vs Simulink
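The slide itself only shows the two sets of results side by side. As a generic illustration of what such a comparison amounts to, the sketch below checks two sample sequences element by element and reports the first disagreement. This is not the mechanism used by DSP Builder or ModelSim; the function name, the `tolerance` parameter and the idea of having both outputs available as integer lists are assumptions made for the example.

```python
def first_mismatch(reference, dut, tolerance=0):
    """Return (index, reference_value, dut_value) of the first sample that
    differs by more than `tolerance`, or None if the sequences agree."""
    if len(reference) != len(dut):
        raise ValueError("sequences have different lengths")
    for index, (ref, got) in enumerate(zip(reference, dut)):
        if abs(ref - got) > tolerance:
            return index, ref, got
    return None

simulink_out = [0, 12, 40, 77, 100]   # hypothetical model output
modelsim_out = [0, 12, 40, 78, 100]   # hypothetical RTL simulation output
print(first_mismatch(simulink_out, modelsim_out))
print(first_mismatch(simulink_out, modelsim_out, tolerance=1))
```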

42 Step 6 - Synthesize HDL & Place & Route
Synthesis
– Leonardo Spectrum
– Synplify
– Quartus II
Place & Route
– Quartus II Fitter

43 Step 7 - Program Device
Download Design to DSP Development Kits

44 Stratix DSP Development Board
[Board photo with callouts]
– 40-Pin Connectors for Analog Devices
– Texas Instruments Connectors on Underside of Board
– Mictor-Type Connectors for HP Logic Analyzers
– MAX 7000 Device
– Analog SMA Connectors
– D/A Converters
– A/D Converters
– Prototyping Area
– Nios Expansion Prototype Connector

45 Stratix DSP Board – Key Features
Stratix EP1S25F780C5 Device (Starter Version)
Stratix EP1S80B956C7 Device (Professional Version)
Analog I/O
– Two 12-bit, 125 MHz A/D Converters
– Two 14-bit, 165 MHz D/A Converters
Digital I/O
– Two 40-pin Connectors for Analog Devices A/D Converter Evaluation Boards
– Connector for TI TMS320 Cross-Platform Daughter Card
– 3.3V Expansion/Prototype Headers
– RS-232 Serial Port
Memory
– 2 Mbytes of 7.5-ns Synchronous SRAM
– 32 Mbytes of FLASH

46 Step 8 - SignalTap II Logic Analyzer
Embedded Logic Analyzer
– Downloads into Device with Design
– Captures State of Internal Nodes
– Uses JTAG for Communication

47 SignalTap II Logic Analyzer
Analysis of Imported Data
[Screenshot labels: Imported Data, Imported Plot]

