Basic FPGA Architecture
© 2005 Xilinx, Inc. All Rights Reserved. For Academic Use Only.

Presentation transcript:

Virtex-II Architecture
The Virtex™-II core voltage is 1.5 V.
I/O Blocks (IOBs)
Configurable Logic Blocks (CLBs)
Clock Management (DCMs, BUFGMUXes)
Block SelectRAM™ resource
Dedicated multipliers
Programmable interconnect

Slices and CLBs
Each Virtex™-II CLB contains four slices
  – Local routing provides feedback between slices in the same CLB, and it provides routing to neighboring CLBs
  – A switch matrix provides access to general routing resources
[Figure: CLB diagram showing the switch matrix, slices S0 to S3, local routing, CIN/COUT carry connections, the SHIFT chain, and BUFTs]

Simplified Slice Structure
Each slice has four outputs
  – Two registered outputs, two non-registered outputs
  – Two BUFTs associated with each CLB, accessible by all 16 CLB outputs
Carry logic runs vertically, up only
  – Two independent carry chains per CLB
[Figure: Slice 0 with two LUTs, carry logic, and two registers (D, CE, PRE, CLR, Q)]

Detailed Slice Structure
The next few slides discuss the slice features
  – LUTs
  – MUXF5, MUXF6, MUXF7, MUXF8 (only the F5 and F6 MUX are shown in this diagram)
  – Carry Logic
  – MULT_ANDs
  – Sequential Elements

Look-Up Tables
Combinatorial logic is stored in Look-Up Tables (LUTs)
  – Also called Function Generators (FGs)
  – Capacity is limited by the number of inputs, not by the complexity
Delay through the LUT is constant
[Figure: combinatorial logic with inputs A, B, C, D and output Z, shown next to its truth table]
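The point that capacity depends on input count rather than complexity can be illustrated with a small software model. This is only a sketch, not Xilinx code: the names `make_lut4` and `lut4_read` are hypothetical, and the bit ordering of the address (A as the least significant bit) is an assumption made for the example.

```python
def make_lut4(func):
    """Precompute the 16-entry truth table for a 4-input Boolean function."""
    table = []
    for index in range(16):
        a = (index >> 0) & 1  # assumed ordering: A is address bit 0
        b = (index >> 1) & 1
        c = (index >> 2) & 1
        d = (index >> 3) & 1
        table.append(func(a, b, c, d) & 1)
    return table


def lut4_read(table, a, b, c, d):
    """Evaluate the LUT: the four inputs simply form an address into the table."""
    return table[a | (b << 1) | (c << 2) | (d << 3)]


# A trivial function and a more involved one each occupy exactly one table,
# and both are evaluated by the same single lookup.
simple = make_lut4(lambda a, b, c, d: a & b)
involved = make_lut4(lambda a, b, c, d: (a ^ b ^ c ^ d) | (a & (1 - d)))
```

Because evaluation is one table lookup regardless of the stored function, the model also mirrors the constant-delay property stated on the slide.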

Connecting Look-Up Tables
MUXF5 combines LUTs in each slice
MUXF6 combines slices S0 and S1
MUXF6 combines slices S2 and S3
MUXF7 combines the two MUXF6 outputs
MUXF8 combines the two MUXF7 outputs (from the CLB above or below)
[Figure: CLB with slices S0 to S3 and their F5, F6, F7, and F8 multiplexers]
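One way to picture the multiplexer hierarchy is as a tree of 2:1 muxes that widens a function by one input per level. The sketch below is a behavioral model under that assumption; `build_lut_tables` and `evaluate` are hypothetical helper names, and the assignment of input bits 4 to 7 to the F5 through F8 select lines is chosen for illustration only.

```python
def build_lut_tables(func, n_inputs):
    """Split an n-input function (4 <= n <= 8) into 2**(n-4) four-input tables."""
    tables = []
    for upper in range(1 << (n_inputs - 4)):
        table = []
        for lower in range(16):
            bits = [(lower >> i) & 1 for i in range(4)]
            bits += [(upper >> i) & 1 for i in range(n_inputs - 4)]
            table.append(func(bits) & 1)
        tables.append(table)
    return tables


def evaluate(tables, bits):
    """Look up every table with the low four inputs, then reduce with 2:1 muxes."""
    lower = bits[0] | (bits[1] << 1) | (bits[2] << 2) | (bits[3] << 3)
    level = [table[lower] for table in tables]
    for sel in bits[4:]:  # first pass models MUXF5, then MUXF6, MUXF7, MUXF8
        level = [level[i + 1] if sel else level[i]
                 for i in range(0, len(level), 2)]
    return level[0]
```

In this model a 5-input function needs two tables and one mux level, while an 8-input function needs sixteen tables and four levels, which is why the slide shows MUXF8 reaching into a neighboring CLB.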

Fast Carry Logic
Simple, fast, and complete arithmetic logic
  – Dedicated XOR gate for single-level sum completion
  – Uses dedicated routing resources
  – All synthesis tools can infer carry logic
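A bit-level model can make the role of the dedicated XOR gate and the carry multiplexer concrete. The sketch below assumes the usual arrangement in which the LUT computes the propagate signal, a carry mux chooses between the incoming carry and one operand bit, and the XOR gate completes the sum; the function name `carry_chain_add` is hypothetical.

```python
def carry_chain_add(a, b, width, carry_in=0):
    """Add two unsigned values bit by bit, the way a slice carry chain would."""
    carry = carry_in
    total = 0
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        propagate = a_i ^ b_i            # computed in the LUT
        sum_bit = propagate ^ carry      # dedicated XOR gate completes the sum
        # Carry mux: pass the incoming carry when propagating, else generate from a_i.
        carry = carry if propagate else a_i
        total |= sum_bit << i
    return total, carry
```

Each loop iteration corresponds to one bit position in the vertical chain, so an N-bit adder costs one LUT per bit in this model.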

MULT_AND Gate
Highly efficient multiply and add implementation
  – Earlier FPGA architectures require two LUTs per bit to perform the multiplication and addition
  – The MULT_AND gate enables an area reduction by performing the multiply and the add in one LUT per bit
[Figure: LUT with MULT_AND gate feeding CY_MUX and CY_XOR; signals A, B, A x B, DI, CI, S, CO]
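The area saving can be sketched by adding two partial-product rows with a single LUT-equivalent per bit. This is an illustrative model under the assumption that the LUT output drives the carry-mux select and the MULT_AND output drives the carry-mux data input; `mult_and_row` is a hypothetical name.

```python
def mult_and_row(a, b0, b1, width):
    """Compute a*b0 + ((a*b1) << 1) with one LUT-equivalent per result bit."""
    carry = 0
    result = 0
    for i in range(width + 2):
        a_i = (a >> i) & 1 if i < width else 0
        a_prev = (a >> (i - 1)) & 1 if 1 <= i <= width else 0
        pp0 = a_i & b0        # MULT_AND output, also the carry-mux data input
        pp1 = a_prev & b1     # second partial product, formed inside the LUT
        select = pp0 ^ pp1    # LUT output: XOR of the two partial products
        result |= (select ^ carry) << i   # CY_XOR produces the sum bit
        carry = carry if select else pp0  # CY_MUX produces the next carry
    return result
```

When `select` is 0 the two partial products are equal, so either one gives the correct carry out; without the extra AND gate that value would need a second LUT.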

Flexible Sequential Elements
Either flip-flops or latches
Two in each slice; eight in each CLB
Inputs come from LUTs or from an independent CLB input
Separate set and reset controls
  – Can be synchronous or asynchronous
All controls are shared within a slice
  – Control signals can be inverted locally within a slice
[Figure: FDCPE, FDRSE, and LDCPE primitive symbols]

Shift Register LUT (SRL16CE)
Dynamically addressable serial shift registers
  – Maximum delay of 16 clock cycles per LUT (128 per CLB)
  – Cascadable to other LUTs or CLBs for longer shift registers
      Dedicated connection from Q15 to D input of the next SRL16CE
  – Shift register length can be changed asynchronously by toggling address A
[Figure: LUT configured as a 16-stage shift register with ports D, CE, CLK, A[3:0], Q, and Q15 (cascade out)]
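A behavioral model helps show how the address selects the tap, and therefore the delay. The class below is a sketch only; it reuses the primitive name `SRL16CE` for readability, but the method names are hypothetical and the model ignores timing.

```python
class SRL16CE:
    """Behavioral sketch of a 16-stage shift register with an addressable tap."""

    def __init__(self, init=0):
        # bits[0] holds the most recently shifted-in value.
        self.bits = [(init >> i) & 1 for i in range(16)]

    def clock(self, d, ce=1):
        """Model one rising clock edge; data shifts only when CE is high."""
        if ce:
            self.bits = [d & 1] + self.bits[:15]

    def q(self, addr):
        """Addressable output: reads stage addr, giving a delay of addr + 1 clocks."""
        return self.bits[addr & 0xF]

    @property
    def q15(self):
        """Fixed last-stage output used for cascading into the next SRL16CE."""
        return self.bits[15]
```

Reading `q(addr)` does not involve the clock, which matches the slide's remark that the effective length changes asynchronously when the address changes. With the address at 15 the delay is the 16-cycle maximum.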

IOB Element
Input path
  – Two DDR registers
Output path
  – Two DDR registers
  – Two 3-state enable DDR registers
Separate clocks and clock enables for I and O
Set and reset signals are shared
[Figure: IOB with input registers (ICK1, ICK2), output registers and DDR MUX (OCK1, OCK2), 3-state registers and DDR MUX (OCK1, OCK2), and the PAD]

Distributed SelectRAM Resources
Uses a LUT in a slice as memory
Synchronous write
Asynchronous read
  – Accompanying flip-flops can be used to create synchronous read
RAM and ROM are initialized during configuration
  – Data can be written to RAM after configuration
Emulated dual-port RAM
  – One read/write port
  – One read-only port
[Figure: RAM16X1S (D, WE, WCLK, A0 to A3, O), RAM32X1S (D, WE, WCLK, A0 to A4, O), and RAM16X1D (D, WE, WCLK, A0 to A3, DPRA0 to DPRA3, SPO, DPO) built from slice LUTs]
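The synchronous-write, asynchronous-read behavior and the emulated dual port can be sketched as follows. The class borrows the primitive name `RAM16X1D` and its port names from the slide, but the Python interface itself is hypothetical.

```python
class RAM16X1D:
    """Behavioral sketch of a 16 x 1 distributed RAM with an extra read port."""

    def __init__(self, init=0):
        # Contents are set at configuration time via the init value.
        self.mem = [(init >> i) & 1 for i in range(16)]

    def clock(self, a, d, we):
        """Synchronous write: memory changes only on a clock edge with WE high."""
        if we:
            self.mem[a & 0xF] = d & 1

    def spo(self, a):
        """Asynchronous read on the read/write port address A."""
        return self.mem[a & 0xF]

    def dpo(self, dpra):
        """Asynchronous read on the read-only port address DPRA."""
        return self.mem[dpra & 0xF]
```

Registering the value returned by `spo` or `dpo` in a separate flip-flop would give the synchronous read mentioned on the slide.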

Block SelectRAM Resources
Up to 3.5 Mb of RAM in 18-kb blocks
  – Synchronous read and write
True dual-port memory
  – Each port has synchronous read and write capability
  – Different clocks for each port
Supports initial values
Synchronous reset on output latches
Supports parity bits
  – One parity bit per eight data bits
[Figure: 18-kb block SelectRAM memory with port A (DIA, DIPA, ADDRA, WEA, ENA, SSRA, CLKA, DOA, DOPA) and port B (DIB, DIPB, ADDRB, WEB, ENB, SSRB, CLKB, DOB, DOPB)]

Dual-Port Block RAM Configurations
Configurations available on each port

Configuration   Depth   Data Bits   Parity Bits
16k x 1         16 kb   1           0
8k x 2          8 kb    2           0
4k x 4          4 kb    4           0
2k x 9          2 kb    8           1
1k x 18         1 kb
x

Independent configurations on ports A and B
  – Supports data-width conversion, including parity bits
[Figure: data-width conversion example; Port A: 8 bits in, Port B: 32 bits out]
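The rows of the configuration table follow from two facts stated on the slides: each block holds 16 kb of data, and parity-capable widths carry one parity bit per eight data bits. The sketch below derives depth and the data/parity split from a port width; `port_geometry` is a hypothetical helper, and the rule it applies to widths that are multiples of 9 is an inference from the slides, not a quoted specification.

```python
BLOCK_DATA_BITS = 16 * 1024  # data capacity of one block, excluding parity


def port_geometry(width):
    """Return (depth, data_bits, parity_bits) for a given port width."""
    if width in (1, 2, 4):
        data_bits, parity_bits = width, 0
    elif width % 9 == 0:
        parity_bits = width // 9          # one parity bit per eight data bits
        data_bits = width - parity_bits
    else:
        raise ValueError(f"unsupported port width: {width}")
    depth = BLOCK_DATA_BITS // data_bits
    return depth, data_bits, parity_bits
```

For example, a width of 9 gives a depth of 2048 with 8 data bits and 1 parity bit, matching the 2k x 9 row, and depth times width for the parity-capable widths comes to 18 kb.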

Dedicated Multiplier Blocks
18-bit twos complement signed operation
Optimized to implement Multiply and Accumulate functions
Multipliers are physically located next to block SelectRAM™ memory
[Figure: 18 x 18 multiplier with inputs Data_A (18 bits) and Data_B (18 bits) and Output (36 bits); sizes shown: 4 x 4 signed, 8 x 8 signed, 12 x 12 signed, 18 x 18 signed]
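The signed 18 x 18 operation with a 36-bit result can be modeled in a few lines. This is a numerical sketch only; `to_signed` and `mult18x18` are hypothetical names, and the model says nothing about pipelining or timing.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def mult18x18(data_a, data_b):
    """Multiply two 18-bit two's complement operands into a 36-bit pattern."""
    product = to_signed(data_a, 18) * to_signed(data_b, 18)
    return product & ((1 << 36) - 1)
```

The largest magnitude product, (-2^17) x (-2^17) = 2^34, fits comfortably in 36 signed bits, so the result never overflows.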

Xilinx Design Flow
[Figure: design flow diagram with the following steps]
Plan & Budget
Create Code/Schematic
HDL RTL Simulation
Synthesize to create netlist
Functional Simulation
Implement
  – Translate
  – Map
  – Place & Route
Attain Timing Closure
Timing Simulation
Create BIT File

Xilinx Implementation
Once you generate a netlist, you can implement the design
There are several outputs of implementation
  – Reports
  – Timing simulation netlists
  – Floorplan files
  – FPGA Editor files
  – and more!
[Figure: Implement step, consisting of Translate, Map, and Place & Route]

What is Implementation?
More than just Place & Route
Implementation includes many phases
  – Translate: Merge multiple design files into a single netlist
  – Map: Group logical symbols from the netlist (gates) into physical components (slices and IOBs)
  – Place & Route: Place components onto the chip, connect the components, and extract timing data into reports
Each phase generates files that allow you to use other Xilinx tools
  – Floorplanner, FPGA Editor, XPower

Project Summary
Design Overview
Device Utilization
Performance and Constraints
Reports

Map Reports
Map Report contents
  – Command line options for the map program
  – Design summary
      List of how many device resources are used
  – Errors and warnings
  – Removed logic summary
      List of logic that was removed due to sourceless or loadless nets
  – IOB properties
      Indicates whether an I/O flip-flop is used
      List of attributes on each I/O pin
Post-Map Static Timing Report not covered here

Map Report Example
Release 4.1i - Map E.30
Xilinx Mapping Report File for Design 'top'
Design Information
Command Line   : map -p xc2v40-fg cm area -k 4 -c 100 -tx off top.ngd
Target Device  : x2v40
Target Package : fg256
Target Speed   : -4
Mapper Version : virtex2 -- $Revision: 1.58 $
Mapped Date    : Tue Aug 21 09:42:
Design Summary

Map Report Example
Number of errors: 0
Number of warnings: 0
Number of Slices: 182 out of %
Number of Slices containing unrelated logic: 0 out of 182 0%
Number of Slice Flip Flops: 170 out of %
Total Number 4 input LUTs: 248 out of %
  Number used as LUTs: 167
  Number used as a route-thru: 81
Number of bonded IOBs: 26 out of 88 29%
Number of GCLKs: 1 out of 16 6%
Total equivalent gate count for design: 3,475
Additional JTAG gate count for IOBs: 1,248
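The percentages in the report can be reproduced from the "used out of available" pairs that survive in the transcript. The helper below is hypothetical and assumes the tool truncates rather than rounds; that assumption is consistent with the rows shown here, though none of them would distinguish truncation from round-to-nearest.

```python
def utilization_percent(used, available):
    """Integer utilization percentage, truncated toward zero (assumed behavior)."""
    return used * 100 // available


# Rows taken from the report above.
bonded_iobs = utilization_percent(26, 88)    # reported as 29%
gclks = utilization_percent(1, 16)           # reported as 6%
unrelated = utilization_percent(0, 182)      # reported as 0%

# The LUT total is the sum of LUTs used as logic and as route-thrus.
total_luts = 167 + 81
```

The last line also shows how the 248 total for 4-input LUTs breaks down into its two sub-entries.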

Place & Route Reports
Place & Route Report contents
  – Command line options for the par program
  – Errors and warnings
  – Device utilization summary
      Similar to the Design Summary from the Map Report
  – Unrouted nets
  – Timing summary
      Statistics on average routing delays
      Performance versus constraints if the design contains timing constraints

Timing Reports
Timing Report contents (for designs with constraints)
  – Command line options for the trce program
  – Timing Constraints section
      Summary of each timing constraint
      Details on paths that fail to meet constraints
  – Data Sheet section
      Setup/hold, clock to pad, timing between clock domains, and pad-to-pad delay information
      Organized in easy-to-read table format
  – Timing Summary section
      Number of errors and Timing Score
      Constraint coverage

Timing Report Example
Release 4.1i - Trace E.30
Copyright (c) Xilinx, Inc. All rights reserved.
trce -e 3 -l 3 -xml top top.ncd -o top.twr top.pcf
Design file: top.ncd
Physical constraint file: top.pcf
Device,speed: xc2v40,-4 (ADVANCED )
Report level: error report
WARNING:Timing - No timing constraints found, doing default enumeration.
================================================================================
Timing constraint: Default period analysis
8292 items analyzed, 0 timing errors detected.
Minimum period is 8.852ns.
Maximum delay is ns
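The minimum period in a timing report translates directly into a maximum clock frequency, since frequency is the reciprocal of period. The helpers below are hypothetical conveniences for that conversion, not part of any Xilinx tool.

```python
def max_frequency_mhz(min_period_ns):
    """Convert a minimum clock period in nanoseconds to a frequency in MHz."""
    return 1000.0 / min_period_ns


def period_ns(frequency_mhz):
    """Convert a clock frequency in MHz to a period in nanoseconds."""
    return 1000.0 / frequency_mhz


# The report above lists a minimum period of 8.852 ns.
reported_fmax = max_frequency_mhz(8.852)   # just under 113 MHz
```

Going the other way, a 50 MHz clock corresponds to a 20 ns period, which is the number a period constraint for that clock would use.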

Timing Report Example
All constraints were met.
Data Sheet report:
All values displayed in nanoseconds (ns)
Clock FiftyM_clk to Pad
                | clk (edge) |
Destination Pad |   to PAD   |
EN              |         (R)|
half1           |    9.465(R)|
half2           |    9.166(F)|
half3           |    9.740(R)|
half4           |    9.174(F)|

Without Timing Constraints
This design had no timing constraints or pin assignments entered when it was implemented
Note the logical structure of the placement and pins.
Xilinx recommends that you compile your design at least once without timing constraints or pin assignments
This design has a maximum system clock frequency of 50 MHz

With Timing Constraints
This is the same design with three global timing constraints entered with the Constraints Editor
It has a maximum system clock frequency of 60 MHz
Note how most of the logic is placed closer to the edge of the device where the pins have been placed

Period Constraint
In this example the Period constraint optimizes all delay paths between flip-flops
The Period constraint does NOT optimize delay paths from input pads to output pads (purely combinatorial), paths from input pads to flip-flops, or paths from flip-flops to output pads
[Figure: example circuit with inputs ADATA, CDATA, and CLK (through a BUFG), flip-flops FLOP1 to FLOP5, combinatorial logic, BUS [7..0], and outputs OUT1 and OUT2]

The Period Constraint
A synchronous element is a flip-flop, latch, or a synchronous RAM
The Period constraint covers paths…
  – Between synchronous elements that are clocked by the reference net
Synchronous elements are grouped by the clock signal driving them. This is called forward propagation, and it lets a single constraint cover large pieces of logic

Offset Constraint
In this example, the Offset constraint optimizes delay paths from input pads to flip-flops and paths from flip-flops to output pads
[Figure: the same example circuit with the Offset In paths (input pads to flip-flops) and Offset Out paths (flip-flops to output pads) marked]

The Offset Constraint
The Offset constraint covers paths…
  – From input pads to synchronous elements clocked by the reference net (Offset In)
  – From synchronous elements clocked by the reference net to output pads (Offset Out)
Note that this constraint does not cover paths…
  – Between synchronous elements
  – From pads to pads (purely combinatorial paths)
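The coverage rules for the Period and Offset constraints can be summarized as a lookup on the type of a path's start point and end point. The sketch below is an illustration of those rules only; `covering_constraint` and the string labels are hypothetical and do not correspond to constraint-file syntax.

```python
def covering_constraint(start, end):
    """Name the global constraint that covers a path, or None if neither does.

    start and end are "pad" or "sync" (flip-flop, latch, or synchronous RAM).
    """
    kinds = {"pad", "sync"}
    if start not in kinds or end not in kinds:
        raise ValueError("endpoints must be 'pad' or 'sync'")
    coverage = {
        ("sync", "sync"): "PERIOD",      # between synchronous elements
        ("pad", "sync"): "OFFSET IN",    # input pad to synchronous element
        ("sync", "pad"): "OFFSET OUT",   # synchronous element to output pad
        ("pad", "pad"): None,            # purely combinatorial, not covered
    }
    return coverage[(start, end)]
```

The one entry mapped to `None` is the pad-to-pad case, which the slides say neither constraint covers, so such paths would need a separate constraint.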