ADC Board VHDL Firmware development for Mona Lisa

1 ADC Board VHDL Firmware development for Mona Lisa
Roy Wastie

2 Overview
- Introduction
- ADC Board
- Hardware Blocks
- Basic FPGA Architectures
- Xilinx ISE 10.1 Tool Flow
- USB
- Algorithm
- VHDL

3 Introduction Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation & glue logic for PCBs.

4 ADC Board

5 Hardware Blocks
Block diagram labels: FPGA, memory controller, External Clock & Trigger, SDRAM memory, FIFO, 16-channel ADC, USB interface, FPGA DAQ

6 Basic FPGA Architectures

7 Objectives
After completing this module, you will be able to:
- Identify the basic architectural resources (looking at the Virtex™-II FPGA)
- List the differences between the Virtex and Spartan families
  - Virtex: Virtex-II, Virtex-II Pro
  - Spartan: Spartan™-3 and Spartan-3E devices
- List the new and enhanced features of the new Virtex-4 device family

8 Outline
- Overview
- Case Study: Virtex-II
  - Logic Resources
  - I/O Resources
  - Memory
  - Clocking
- New Architectures
  - Virtex Family
  - Spartan Family
- Summary

9 Overview
All Xilinx FPGAs contain the same basic resources
- Logic resources
  - Slices (grouped into CLBs): contain combinatorial logic and register resources
  - Memory
  - Multipliers
- Interconnect resources
  - Programmable interconnect
  - IOBs: interface between the FPGA and the outside world
- Other resources
  - Global clock buffers
  - Boundary scan logic

10 Virtex-II Architecture
- First FPGA device to include embedded multipliers
- The core of the Virtex™-II architecture operates at 1.5V
Diagram labels: I/O Blocks (IOBs), Block SelectRAM™ resource, programmable interconnect, dedicated multipliers, Configurable Logic Blocks (CLBs), clock management (DCMs, BUFGMUXes)

12 Basic Building Block
Configurable Logic Block (CLB)
- A Virtex-II CLB contains four slices
- Slices contain logic resources and are arranged in two columns
- A switch matrix provides access to general routing resources
- Local routing connects slices in the same CLB and provides routing to neighboring CLBs
Diagram labels: switch matrix, slices S0 to S3, local routing, BUFTs, SHIFT, CIN, COUT

13 Basic Building Blocks
Simplified Slice Structure
- Each slice has four outputs
  - Two registered outputs, two non-registered outputs
- Two BUFTs associated with each CLB, accessible by all 16 CLB outputs
- Carry logic runs vertically, up only
  - Two independent carry chains per CLB
Diagram (Slice 0): two LUTs, each with carry logic and a register (D, CE, PRE, CLR, Q)

14 The Slice
Detailed Structure
The next few slides discuss the slice features:
- LUTs
- MUXF5, MUXF6, MUXF7, MUXF8 (only the F5 and F6 MUXes are shown in this diagram)
- Carry logic
- MULT_ANDs
- Sequential elements

15 Combinatorial Logic
- Boolean logic is stored in Look-Up Tables (LUTs)
  - Also called Function Generators (FGs)
  - Capacity is limited by the number of inputs, not by the complexity
  - Delay through the LUT is constant
Diagram: combinatorial logic with inputs A, B, C, D and output Z, shown next to its truth table
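The point that LUT capacity depends on the number of inputs rather than on the complexity of the function can be illustrated with a minimal sketch. The entity name `lut4_demo` and the chosen Boolean function are hypothetical and not from the presentation; whether a given synthesis tool places this in exactly one LUT depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: any Boolean function of four inputs
-- fits the 16-entry truth table of a single 4-input LUT.
entity lut4_demo is
  port (A, B, C, D : in  std_logic;
        Z          : out std_logic);
end lut4_demo;

architecture rtl of lut4_demo is
begin
  -- Several gates in the source code, but still only four inputs,
  -- so the whole expression is one truth table.
  Z <= ((A xor B) and (C or D)) or ((not A) and C and (not D));
end rtl;
```

Replacing the expression with any other function of A, B, C, and D would change only the contents of the truth table, not the resource count or the delay.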

16 Storage Elements
- Can be implemented as either flip-flops or latches
- Two in each slice; eight in each CLB
- Inputs come from LUTs or from an independent CLB input
- Separate set and reset controls
  - Can be synchronous or asynchronous
- All controls are shared within a slice
  - Control signals can be inverted locally within a slice
Diagram symbols: FDCPE (D, CE, PRE, CLR, Q), FDRSE (S, R), LDCPE_1 (G)
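As a rough illustration of a storage element with separate asynchronous set and reset controls and a clock enable, the following sketch describes a flip-flop in the style of the FDCPE symbol shown on the slide. The entity name `ff_async_demo` is hypothetical, and the port names are borrowed from the symbol; it is a behavioural description that a synthesis tool would be expected to map onto a slice register, not a primitive instantiation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical D flip-flop with clock enable, asynchronous
-- clear and asynchronous preset.
entity ff_async_demo is
  port (C, CE, CLR, PRE, D : in  std_logic;
        Q                  : out std_logic);
end ff_async_demo;

architecture rtl of ff_async_demo is
begin
  process (C, CLR, PRE)
  begin
    if CLR = '1' then          -- asynchronous clear, checked first
      Q <= '0';
    elsif PRE = '1' then       -- asynchronous preset
      Q <= '1';
    elsif rising_edge(C) then
      if CE = '1' then         -- clock enable gates the update
        Q <= D;
      end if;
    end if;
  end process;
end rtl;
```

Moving the clear and preset tests inside the `rising_edge(C)` branch would describe the synchronous variant mentioned on the slide.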

17 Dedicated Logic
FPGAs contain built-in logic for speeding up logic operations and saving resources
- Multiplexer logic: connects slices and LUTs
- Carry chains: speed up arithmetic operations
- Multiplier AND gate: speeds up LUT-based multiplication
- Shift register LUT: LUT-based shift register
- Embedded multiplier: 18x18 multiplier

18 Multiplexer Logic
Dedicated MUXes are provided to connect slices and LUTs
- MUXF5 combines the LUTs in each slice
- MUXF6 combines slices S0 and S1, and slices S2 and S3
- MUXF7 combines the two MUXF6 outputs
- MUXF8 combines the two MUXF7 outputs (from the CLB above or below)
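A wide multiplexer is the typical structure that benefits from these dedicated MUXes. The sketch below is a plain behavioural 16:1 multiplexer; the entity name `mux16_demo` is hypothetical, and how the logic is split between LUTs and MUXF5/MUXF6/MUXF7 resources is left to the synthesis tool, so the mapping should be checked in the synthesis report rather than assumed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 16:1 multiplexer, written behaviourally.
entity mux16_demo is
  port (DIN : in  std_logic_vector(15 downto 0);
        SEL : in  std_logic_vector(3 downto 0);
        Y   : out std_logic);
end mux16_demo;

architecture rtl of mux16_demo is
begin
  -- Select one of the sixteen inputs by index.
  Y <= DIN(to_integer(unsigned(SEL)));
end rtl;
```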

19 Carry Chains
Dedicated carry chains speed up arithmetic operations
- Simple, fast, and complete arithmetic logic
  - Dedicated XOR gate for single-level sum completion
  - Uses dedicated routing resources
  - All synthesis tools can infer carry logic
Diagram labels: first carry chain (to S0 of the next CLB), second carry chain (to CIN of S2 of the next CLB), slices S0 to S3, CIN, COUT
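Since the slide notes that synthesis tools can infer carry logic, a behavioural adder is normally enough to make use of the carry chain. The following is a minimal sketch; the entity name `adder_demo` and the generic `WIDTH` are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical unsigned adder; the "+" operator is what a
-- synthesis tool would be expected to map onto the carry chain.
entity adder_demo is
  generic (WIDTH : positive := 16);
  port (A, B : in  std_logic_vector(WIDTH-1 downto 0);
        SUM  : out std_logic_vector(WIDTH downto 0));
end adder_demo;

architecture rtl of adder_demo is
begin
  -- Extend both operands by one bit so the carry out is kept.
  SUM <= std_logic_vector(resize(unsigned(A), WIDTH+1) +
                          resize(unsigned(B), WIDTH+1));
end rtl;
```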

20 Multiplier AND Gate
Speeds up LUT-based multiplication
- Highly efficient multiply and add implementation
  - Earlier FPGA architectures require two LUTs per bit to perform the multiplication and addition
  - The MULT_AND gate enables an area reduction by performing the multiply and the add in one LUT per bit
Diagram labels: LUT, MULT_AND (A x B), CY_MUX (S, DI, CI, CO), CY_XOR

21 Shift Register LUT (SRL16CE)
The shift register LUT saves you from having to use dedicated registers
- Dynamically addressable serial shift registers
  - Maximum delay of 16 clock cycles per LUT (128 per CLB)
  - Cascadable to other LUTs or CLBs for longer shift registers
    - Dedicated connection from Q15 to the D input of the next SRL16CE
  - Shift register length can be changed asynchronously by toggling address A
Diagram: a chain of clock-enabled D flip-flops inside the LUT, with inputs D, CE, CLK, and A[3:0], and outputs Q and Q15 (cascade out)
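The behaviour described above can be sketched as a 16-stage shift register with a clock enable, no reset, and an addressable tap. The entity name `srl16_demo` is hypothetical and the port names follow the slide's diagram; whether a tool actually maps this description onto an SRL16CE instead of sixteen flip-flops depends on the tool and its options.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical behavioural model of a dynamically addressable
-- 16-bit shift register with clock enable.
entity srl16_demo is
  port (CLK, CE, D : in  std_logic;
        A          : in  std_logic_vector(3 downto 0);
        Q, Q15     : out std_logic);
end srl16_demo;

architecture rtl of srl16_demo is
  signal sr : std_logic_vector(15 downto 0) := (others => '0');
begin
  process (CLK)
  begin
    if rising_edge(CLK) then
      if CE = '1' then
        sr <= sr(14 downto 0) & D;   -- shift in at bit 0
      end if;
    end if;
  end process;

  -- A selects the tap: A = 0 gives a 1-cycle delay, A = 15 gives 16.
  Q   <= sr(to_integer(unsigned(A)));
  Q15 <= sr(15);                     -- cascade output
end rtl;
```

The tap select is purely combinatorial, which matches the slide's remark that the length can be changed asynchronously by toggling A.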

22 Embedded Multiplier Blocks
Save you from having to use LUTs to implement multiplications, and increase performance
- 18-bit two's complement signed operation
- Optimized to implement multiply-and-accumulate functions
- Multipliers are physically located next to block SelectRAM™ memory
Diagram: 18 x 18 multiplier with inputs Data_A (18 bits) and Data_B (18 bits) and Output (36 bits); 4 x 4, 8 x 8, 12 x 12, and 18 x 18 signed
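A signed 18 x 18 multiplication with a 36-bit result can be described behaviourally as in the sketch below. The entity name `mult18_demo` and the output register are assumptions added for illustration; the port names follow the slide's diagram. A synthesis tool targeting a device with embedded multipliers would be expected, though not guaranteed, to use one for this.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical registered signed multiplier, 18 x 18 -> 36 bits.
entity mult18_demo is
  port (CLK    : in  std_logic;
        DATA_A : in  std_logic_vector(17 downto 0);
        DATA_B : in  std_logic_vector(17 downto 0);
        P      : out std_logic_vector(35 downto 0));
end mult18_demo;

architecture rtl of mult18_demo is
begin
  process (CLK)
  begin
    if rising_edge(CLK) then
      -- signed * signed yields a result of 18 + 18 = 36 bits
      P <= std_logic_vector(signed(DATA_A) * signed(DATA_B));
    end if;
  end process;
end rtl;
```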

24 IOB Element
Connects the FPGA design to external components
- Input path
  - Two DDR registers
- Output path
  - Two 3-state enable DDR registers
- Separate clocks and clock enables for I and O
- Set and reset signals are shared
Diagram labels: PAD; input registers (ICK1, ICK2); output registers with DDR MUX (OCK1, OCK2); 3-state registers with DDR MUX (OCK1, OCK2)

25 SelectIO Standard
- FPGA I/O pins can be configured to support various standards
  - Allows direct connections to external signals of varied voltages and thresholds
  - Optimizes the speed/noise tradeoff
  - Saves having to place interface components onto your board
- Differential signaling standards
  - LVDS, BLVDS, ULVDS
  - LDT
  - LVPECL
- Single-ended I/O standards
  - LVTTL, LVCMOS (3.3V, 2.5V, 1.8V, and 1.5V)
  - PCI-X at 133 MHz, PCI (3.3V at 33 MHz and 66 MHz)
  - GTL, GTLP
  - and more!

26 Digitally Controlled Impedance (DCI)
- DCI provides
  - Output drivers that match the impedance of the traces
  - On-chip termination for receivers and transmitters
- DCI advantages
  - Improves signal integrity by eliminating stub reflections
    - Stub reflections occur when the termination resistor is too far away from the end of the transmission line
    - With DCI, the resistors are as close to the input buffer or output buffer as possible, thereby eliminating stub reflections
  - Reduces board routing complexity and component count by eliminating external resistors
  - Eliminates the effects of temperature, voltage, and process variations by using an internal feedback circuit

28 Distributed RAM
Uses a LUT in a slice as memory
- Synchronous write
- Asynchronous read
  - Accompanying flip-flops can be used to create synchronous read
- RAM and ROM are initialized during configuration
  - Data can be written to RAM after configuration
- Emulated dual-port RAM
  - One read/write port
  - One read-only port
- 1 LUT = 16 RAM bits
Diagram symbols: RAM16X1S (D, WE, WCLK, A0 to A3, O), RAM32X1S (D, WE, WCLK, A0 to A4, O), RAM16X1D (D, WE, WCLK, A0 to A3, SPO, DPRA0 to DPRA3, DPO)
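The synchronous-write, asynchronous-read behaviour can be sketched as a 16 x 1 single-port RAM, which is the size of one LUT. The entity name `ram16x1_demo` is hypothetical and the port names are borrowed from the RAM16X1S symbol on the slide; this is a behavioural description, and whether it ends up in distributed RAM is decided by the synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 16 x 1 RAM: synchronous write, asynchronous read.
entity ram16x1_demo is
  port (WCLK, WE, D : in  std_logic;
        A           : in  std_logic_vector(3 downto 0);
        O           : out std_logic);
end ram16x1_demo;

architecture rtl of ram16x1_demo is
  type ram_t is array (0 to 15) of std_logic;
  signal ram : ram_t := (others => '0');  -- initial contents
begin
  process (WCLK)
  begin
    if rising_edge(WCLK) then
      if WE = '1' then
        ram(to_integer(unsigned(A))) <= D;   -- write on the clock edge
      end if;
    end if;
  end process;

  -- Read is combinatorial: the output follows the address directly.
  O <= ram(to_integer(unsigned(A)));
end rtl;
```

Registering `O` in a clocked process would give the synchronous read that the slide attributes to the accompanying flip-flops.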

29 Block RAM
Embedded blocks of RAM arranged in columns
- Up to 3.5 Mb of RAM in 18-kb blocks
  - Synchronous read and write
- True dual-port memory
  - Each port has synchronous read and write capability
  - Different clocks for each port
- Supports initial values
- Synchronous reset on output latches
- Supports parity bits
  - One parity bit per eight data bits
- Situated next to embedded multiplier for fast multiply-accumulate operations
Diagram: 18-kb block SelectRAM memory with port A (DIA, DIPA, ADDRA, WEA, ENA, SSRA, CLKA, DOA, DOPA) and port B (DIB, DIPB, ADDRB, WEB, ENB, SSRB, CLKB, DOB, DOPB)
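A memory with synchronous read and a separate clock per port can be sketched as below. This is a deliberately reduced example: it uses only one write port and one read port, which is a subset of the true dual-port capability described on the slide. The entity name `bram_demo` is hypothetical, the port names are borrowed from the slide's pin list, and the 1024 x 18 organisation (16 data bits plus 2 parity bits per word, 18,432 bits in total) is one assumed way of filling an 18-kb block.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 1024 x 18 memory: port A writes, port B reads,
-- each on its own clock, both synchronous.
entity bram_demo is
  port (CLKA  : in  std_logic;
        WEA   : in  std_logic;
        ADDRA : in  std_logic_vector(9 downto 0);
        DIA   : in  std_logic_vector(17 downto 0);
        CLKB  : in  std_logic;
        ADDRB : in  std_logic_vector(9 downto 0);
        DOB   : out std_logic_vector(17 downto 0));
end bram_demo;

architecture rtl of bram_demo is
  type ram_t is array (0 to 1023) of std_logic_vector(17 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  write_port : process (CLKA)
  begin
    if rising_edge(CLKA) then
      if WEA = '1' then
        ram(to_integer(unsigned(ADDRA))) <= DIA;
      end if;
    end if;
  end process;

  read_port : process (CLKB)
  begin
    if rising_edge(CLKB) then
      DOB <= ram(to_integer(unsigned(ADDRB)));  -- registered read
    end if;
  end process;
end rtl;
```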

31 Global Routing
- Sixteen dedicated global clock multiplexers
  - Eight on the top-center of the die, eight on the bottom-center
  - Driven by a clock input pad, a DCM, or local routing
- Global clock multiplexers provide the following:
  - Traditional clock buffer (BUFG) function
  - Global clock enable capability (BUFGCE)
  - Glitch-free switching between clock signals (BUFGMUX)
- Up to eight clock nets can be used in each clock region of the device
  - Each device contains four or more clock regions
For more information about clock distribution and clock regions, refer to the “Clocking Techniques” module in the Advanced FPGA Implementation course.

32 Digital Clock Manager (DCM)
- Up to twelve DCMs per device
  - Located on the top and bottom edges of the die
  - Driven by clock input pads
- DCMs provide the following:
  - Delay-Locked Loop (DLL)
  - Digital Frequency Synthesizer (DFS)
  - Digital Phase Shifter (DPS)
- Up to four outputs of each DCM can drive onto global clock buffers
  - All DCM outputs can drive general routing

34 Virtex Architectures
Built for high-performance applications
Latest families include:
- Virtex-II Pro
- Virtex-4
- Virtex-5

35 Virtex-II Pro Architecture
Contains embedded processors and multi-gigabit transceivers
- Advanced FPGA logic: 99k logic cells
- High-performance true dual-port RAM: 8 Mb
- SelectIO™-Ultra Technology I/O
- XtremeDSP functionality: embedded multipliers
- RocketIO™ and RocketIO X high-speed serial transceivers: 622 Mbps to Gbps
- PowerPC™ processors, 400+ MHz clock rate: 2
- XCITE Digitally Controlled Impedance: any I/O
- DCM™ Digital Clock Management: 12
- 130 nm, 9-layer copper in 300 mm wafer technology

36 Virtex-4 Family
Optimized for logic, embedded, and signal processing

Resource       LX             FX              SX
Logic          14K–200K LCs   12K–140K LCs    23K–55K LCs
Memory         0.9–6 Mb       0.6–10 Mb       2.3–5.7 Mb
DCMs           4–12           4–20            4–8
DSP Slices     32–96          32–192          128–512
SelectIO       240–960        240–896         320–640
RocketIO       N/A            0–24 Channels   N/A
PowerPC        N/A            1 or 2 Cores    N/A
Ethernet MAC   N/A            2 or 4 Cores    N/A

37 Virtex-4 Architecture
- RocketIO™ multi-gigabit transceivers: 622 Mbps–10.3 Gbps
- Smart RAM: new block RAM/FIFO
- Xesium clocking technology: 500 MHz
- Advanced CLBs: 200K logic cells
- Tri-Mode Ethernet MAC: 10/100/1000 Mbps
- XtremeDSP™ Technology Slices: x18 GMACs
- 1 Gbps SelectIO™: ChipSync™ source synch, XCITE active termination
- PowerPC™ 405 with APU interface: 450 MHz, 680 DMIPS

38 Virtex-5 Family
Optimized for logic, embedded, and signal processing
- Virtex™-5 platforms: LX (Logic), LXT (Logic/Serial), SXT (DSP/Serial), FXT (Emb./Serial)
- Resources compared across platforms: logic, on-chip RAM, DSP capabilities, parallel I/Os, serial I/Os, PowerPC® processors

39 Virtex-5 Architecture
Features (marked New or Enhanced on the slide):
- Most advanced high-performance real 6-LUT logic fabric
- 36-Kbit dual-port block RAM / FIFO with integrated ECC
- 550 MHz clock management tile with DCM and PLL
- PCI Express® endpoint block
- System monitor function with built-in ADC
- SelectIO with ChipSync technology and XCITE DCI
- Advanced configuration options
- Next-generation PowerPC® embedded processor
- 25x18 DSP slice with integrated ALU
- RocketIO™ transceiver options
  - Low-power GTP: up to 3.75 Gbps
  - High-performance GTX: up to 6.5 Gbps
- Tri-mode 10/100/1000 Mbps Ethernet MACs

41 The Spartan-3 Family
Built for high-volume, low-cost applications
- 18x18-bit embedded pipelined multipliers for efficient DSP
- Configurable 18K block RAMs plus distributed RAM
- 4 I/O banks (Bank 0 to Bank 3), with support for all I/O standards including PCI, DDR333, RSDS, mini-LVDS
- Up to eight on-chip Digital Clock Managers to support multiple system clocks

42 Spartan-3 Family
Based upon the Virtex-II architecture, optimized for lower cost
- Smaller process = lower core voltage
  - 0.09 micron versus 0.15 micron
  - Vccint = 1.2V versus 1.5V
- Logic resources
  - Only one-half of the slices support RAM or SRL16s (SLICEM)
  - Fewer block RAMs and multiplier blocks
- Clock resources
  - Fewer global clock multiplexers and DCM blocks
- I/O resources
  - Fewer pins per package
  - No internal 3-state buffers
  - Support for different standards
    - New standards: 1.2V LVCMOS, 1.8V HSTL, and SSTL
    - Default is LVCMOS, versus LVTTL

43 SLICEM and SLICEL
- Each Spartan™-3 CLB contains four slices
  - Similar to the Virtex™-II
- Slices are grouped in pairs
  - Left-hand SLICEM (memory): LUTs can be configured as memory or SRL16
  - Right-hand SLICEL (logic): LUTs can be used as logic only
Diagram labels: switch matrix; left-hand SLICEM slices X0Y0 and X0Y1; right-hand SLICEL slices X1Y0 and X1Y1; fast connects; SHIFTIN, SHIFTOUT, CIN, COUT

44 Multiple Domain-optimized Platforms

45 Spartan-3E Features
- More gates per I/O than Spartan-3
- Removed some I/O standards
  - Higher-drive LVCMOS
  - GTL, GTLP
  - SSTL2_II
  - HSTL_II_18, HSTL_I, HSTL_III
  - LVDS_EXT, ULVDS
- DDR cascade
  - Internal data is presented on a single clock edge
- 16 BUFGMUXes on left and right sides
  - Drive half the chip only
  - In addition to eight global clocks
- Pipelined multipliers
- Additional configuration modes
  - SPI, BPI
  - Multi-Boot mode

47 Summary
Virtex-II contains logic, I/O, memory, and clocking resources
- Virtex-II logic resources
  - CLBs, which are made up of slices that contain LUTs (these can be configured as shift registers or memory)
  - Storage elements (flip-flops or latches)
  - Dedicated logic for speeding up logic operations
  - Embedded multipliers
- Virtex-II I/O resources
  - SelectIO™ enables communication across multiple standards
  - DCI reduces board complexity by reducing component count
- Virtex™-II memory resources
  - Distributed SelectRAM™ resources and distributed SelectROM (uses CLB LUTs)
  - 18-kb block SelectRAM resources
- Virtex-II clocking resources
  - Dedicated global clock lines
  - Digital Clock Managers

48 Summary
- Virtex-II is the basis for future architectures
- Virtex architectures are designed for high-performance applications
  - Latest families include Virtex-II Pro, Virtex-4, and Virtex-5
- Spartan architectures are designed for low-cost, high-volume applications
  - Latest families include Spartan-3, Spartan-3E, Spartan-3A, Spartan-3AN, Spartan-3A DSP

49 Where Can I Learn More?
- Documentation
  - Documentation > Devices
- On-Demand Learning
  - Training
  - Recorded e-learning (REL)
  - On-demand webcasts
- Demo
  - Open your browser and go to the support website. In the top navigation bar, click Support.
  - Shows relevant areas of the support website.

50 Xilinx Tool Flow

51 Objectives
After completing this module, you will be able to:
- List the steps of the Xilinx design process
- Implement and simulate an FPGA design by using default software options

52 Outline
- Overview
- ISE Foundation
- Summary
- Lab 1: Xilinx Tool Flow Demo

53 Xilinx Design Flow
1. Plan & Budget
2. Create Code/Schematic
3. HDL RTL Simulation
4. Synthesize to create netlist
5. Functional Simulation
6. Implement: Translate, Map, Place & Route
7. Attain Timing Closure
8. Timing Simulation
9. Generate BIT File
10. Configure FPGA

54 Design Entry
- Plan and budget
- Create designs in HDL or schematic
- Whichever method you use, you will need a tool to generate an EDIF or NGC netlist to bring into the Xilinx implementation tools
  - Popular synthesis tools include Synplify, Precision, FPGA Compiler II, and XST
- Tools available to assist in design entry
  - Architecture Wizard, CORE Generator™ system, and StateCAD tools
- Simulate the design to ensure that it works as expected!
Flow diagram (this stage): Plan & Budget, Create Code/Schematic, HDL RTL Simulation, Synthesize to create netlist, Functional Simulation

55 Synthesis
Generate a netlist file
- After writing your HDL code, you will need a tool to generate a netlist (NGC or EDIF)
  - Xilinx Synthesis Tool (XST) included
  - Support for popular third-party synthesis tools: Synplify, Leonardo Spectrum

56 Implementation
Process a netlist file
- Consists of three phases
  - Translate: merge multiple design files into a single netlist
  - Map: group logical symbols from the netlist (gates) into physical components (slices and IOBs)
  - Place & Route: place components onto the chip, connect the components, and extract timing data into reports
- Access Xilinx reports and tools at each phase
  - Timing Analyzer, Floorplanner, FPGA Editor, XPower
Flow diagram (this stage): netlist generated from synthesis, then Implement (Translate, Map, Place & Route)

57 Configuration
- Once a design is implemented, you must create a file that the FPGA can understand
  - This file is called a bitstream: a BIT file (.bit extension)
- The BIT file can be downloaded
  - Directly into the FPGA
    - Use a download cable such as Platform USB
  - To an external memory device such as a Xilinx Platform Flash PROM
    - Must first be converted into a PROM file

58 Online Software Manuals
See Development System Reference Guide for Flow Diagrams

59 Timing Closure

61 ISE Project Navigator
Xilinx ISE Foundation is built around the Xilinx Design Flow
- Enter designs
- Access to synthesis tools
  - Including third-party synthesis tools
- Implement your design with a simple double-click
  - Fine-tune with easy-to-access software options
- Download
  - Generate a bitstream
  - Configure FPGA using iMPACT

62 Entering Designs
New Source Wizard available to assist with design entry
- Select source type
  - Design entry: schematic, HDL source (VHDL and Verilog)
  - Design entry tools: Architecture Wizard, Core Generator, Chipscope, State Diagram, Embedded Processor
  - Simulation: test bench (VHDL, Verilog, and waveform)

63 Synthesizing Designs
Generate a netlist file using XST (Xilinx Synthesis Technology)
1. Highlight HDL sources
2. Double-click to synthesize
Synthesis processes and analysis:
- Access report
- View schematics (RTL or technology)
- Check syntax
- Generate post-synthesis simulation model

64 Implementing Designs
Process the netlist generated from synthesis
1. Highlight HDL sources
2. Double-click to Synthesize
Implement a design:
- Translate
  - Access reports
  - Floorplan design
- Map
  - Analyze timing
  - Manually place components
  - Generate simulation model
- Place & Route
  - Manually place & route components
  - And more

65 The Design Summary Displays Design Data
- Quick view of reports and constraints
- Project status
- Device utilization
- Design summary options
- Performance and constraints
- Reports

66 Simulating Designs
Verify the design with the ISE Simulator
1. Select simulation type
2. Highlight test bench
3. Double-click to simulate
- Add a test bench
  - VHDL, Verilog, or Xilinx waveform file
- Perform a behavioral simulation
  - Use the UNISIM/UniMacro library when FPGA primitives are instantiated in the design
  - Use the XilinxCoreLib library when IP cores are instantiated in the design
- Perform a timing simulation
  - Use the Xilinx SIMPRIM library when FPGA primitives are instantiated in the design
- SmartModels simulation library for both functional and timing simulation of Xilinx hard IP such as PPC, PCIe, GT, TEMAC

67 Configuring FPGAs
Generate PROM files and download to devices using iMPACT
1. Highlight source file
2. Double-click to generate .bit
3. Double-click to invoke iMPACT programming tools
- Configure FPGAs from computer
  - Use iMPACT to download the bitstream from the computer to the FPGA via a Xilinx download cable (e.g. Platform USB)
- Configure FPGAs from external memory
  - Xilinx Platform Flash: use iMPACT to generate the PROM file and download it to the PROM using a Xilinx download cable
  - Generic parallel PROM: use iMPACT to generate the PROM file (no support for programming)
  - Compact Flash (Xilinx System ACE required): use iMPACT to generate the SysACE file (no support for programming)
There are two ways to program an FPGA: through a PROM device, where you must generate a file that the PROM programmer can understand, or directly from the computer using the iMPACT configuration tool.

69 Review Questions
- What are the phases of the Xilinx design flow?
- What are the components of implementation, and what happens at each step?
- What are two methods of programming an FPGA?

70 Answers
- What are the phases of the Xilinx design flow?
  - Plan and budget, create code or schematic, RTL simulation, synthesize, functional simulation, implement, timing closure, timing simulation, and BIT file creation
- What are the components of implementation, and what happens at each step?
  - Translate: merges multiple design files into one netlist
  - Map: groups logical symbols into physical components
  - Place & Route: places components onto the chip and connects them
- What are two methods of programming an FPGA?
  - Directly from computer
  - From external memory device

71 Summary
- Implementation means more than Place & Route
- Xilinx provides a simple pushbutton tool to guide you through the Xilinx design process

72 Where Can I Learn More?
- Complete design flow tutorials
  - Documentation > Tutorials
- On implementation: Development System Reference Guide
  - Documentation > Software Manuals
  - Documentation may also be installed on your local computer
- On simulation: ISIM Online Help
- On-Demand Learning
  - Training
  - Recorded e-learning (REL)
  - On-demand webcasts

74 USB

75 USB2
- Peer to peer; the host computer is master
- 480 Mbit/s theoretical
- 30 MB/s readily achievable in bulk transfer mode
- Speeds: USB 1.0 Low and Full, USB2 High
- Hot plug
- Peripheral electronics can be relatively simple and inexpensive
- Power: 500 mA from the bus

76 USB Data Travels in Packets
- Identified by “Packet ID” (PID)
- Token packet tells what’s coming
- Data packets deliver bytes
- Handshake packets report success or otherwise

77 USB Packets
Diagram: a control write transfer, made up of a Setup Stage, a Data Stage (continued over further transactions), and a Status Stage. Each stage is shown as a token packet, a data packet, and a handshake (H/S) packet. Field labels visible in the diagram include SYNC, ADDR, ENDP, CRC5, CRC16, and ACK.

78 USB2 Controller
EZ-USB FX2LP(TM) USB Microcontroller
- High-speed USB peripheral controller
- Integrated 8051 microprocessor
- Code/data downloaded via USB or EEPROM
- Many integrated peripherals

81 Simple Algorithm
- Sample data at full rate: 2.77 Ms/s (16 channels)
- Down convert data by 4
- Write data to USB interface: 21.19 MB/s
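The quoted output rate can be roughly cross-checked from the other figures on the slide. The check below assumes 2 bytes (16 bits) per sample, which the slide does not state, and reads the sample rate as applying to each of the 16 channels; here \(f_s\) is the sample rate, \(N_{ch}\) the channel count, \(B\) the bytes per sample, and \(D\) the down-conversion factor.

```latex
\[
R \;=\; \frac{f_s \, N_{ch} \, B}{D}
  \;=\; \frac{2.77\times10^{6}\ \mathrm{s^{-1}} \times 16 \times 2\ \mathrm{B}}{4}
  \;=\; 22.16\times10^{6}\ \mathrm{B/s}
  \;\approx\; 21.13\ \mathrm{MiB/s}
\]
```

Under these assumptions the result lands close to the 21.19 MB/s on the slide when megabytes are counted as 2^20 bytes. The small remaining difference would be consistent with the 2.77 Ms/s figure being rounded (for example from about 2.778 Ms/s), but that is a guess, not something the slide confirms.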

82 VHDL

83 VHDL Example
An example of a two-input XNOR gate is shown below.

library ieee;
use ieee.std_logic_1164.all;

entity XNOR2 is
  port (A, B : in  std_logic;
        Z    : out std_logic);
end XNOR2;

architecture behavioral_xnor of XNOR2 is
  -- signal declaration (of internal signals X, Y)
  signal X, Y : std_logic;
begin
  X <= A and B;              -- both inputs high
  Y <= (not A) and (not B);  -- both inputs low
  Z <= X or Y;               -- high when the inputs are equal
end behavioral_xnor;
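The XNOR2 entity above could be exercised with a small self-checking test bench such as the following sketch. The test bench name `XNOR2_tb`, the 10 ns step, and the messages are all hypothetical; it assumes XNOR2 has been compiled into the `work` library together with the `ieee.std_logic_1164` package it depends on.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical test bench: applies all four input combinations
-- to XNOR2 and checks the output after each one.
entity XNOR2_tb is
end XNOR2_tb;

architecture sim of XNOR2_tb is
  signal A, B : std_logic := '0';
  signal Z    : std_logic;
begin
  uut : entity work.XNOR2
    port map (A => A, B => B, Z => Z);

  stimulus : process
  begin
    A <= '0'; B <= '0'; wait for 10 ns;
    assert Z = '1' report "00 should give 1" severity error;

    A <= '0'; B <= '1'; wait for 10 ns;
    assert Z = '0' report "01 should give 0" severity error;

    A <= '1'; B <= '0'; wait for 10 ns;
    assert Z = '0' report "10 should give 0" severity error;

    A <= '1'; B <= '1'; wait for 10 ns;
    assert Z = '1' report "11 should give 1" severity error;

    report "XNOR2 test bench finished" severity note;
    wait;  -- stop the stimulus process
  end process;
end sim;
```

A file like this would be added to the project as a VHDL test bench source and run as a behavioral simulation, as described in the Simulating Designs slide.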
