XC6200 Family FPGAs By: Ahmad Alsolaim Alsolaim.

Slides:



Advertisements
Similar presentations
Field Programmable Gate Array
Advertisements

ECE 506 Reconfigurable Computing ece. arizona
Commercial FPGAs: Altera Stratix Family Dr. Philip Brisk Department of Computer Science and Engineering University of California, Riverside CS 223.
Xilinx CPLDs and FPGAs Module F2-1. CPLDs and FPGAs XC9500 CPLD XC4000 FPGA Spartan FPGA Spartan II FPGA Virtex FPGA.
1 KU College of Engineering Elec 204: Digital Systems Design Lecture 9 Programmable Configurations Read Only Memory (ROM) – –a fixed array of AND gates.
Altera FLEX 10K technology in Real Time Application.
Programmable Logic Devices
Spartan II Features  Plentiful logic and memory resources –15K to 200K system gates (up to 5,292 logic cells) –Up to 57 Kb block RAM storage  Flexible.
Graduate Computer Architecture I Lecture 15: Intro to Reconfigurable Devices.
FPGA-Based System Design: Chapter 3 Copyright  2004 Prentice Hall PTR SRAM-based FPGA n SRAM-based LE –Registers in logic elements –LUT-based logic element.
EECE579: Digital Design Flows
Implementing Logic Gates and Circuits Discussion D5.1.
Implementing Logic Gates and Circuits Discussion D5.3 Section 11-2.
Lecture 26: Reconfigurable Computing May 11, 2004 ECE 669 Parallel Computer Architecture Reconfigurable Computing.
ENGIN112 L38: Programmable Logic December 5, 2003 ENGIN 112 Intro to Electrical and Computer Engineering Lecture 38 Programmable Logic.
The Spartan 3e FPGA. CS/EE 3710 The Spartan 3e FPGA  What’s inside the chip? How does it implement random logic? What other features can you use?  What.
Configurable System-on-Chip: Xilinx EDK
Evolution of implementation technologies
Programmable logic and FPGA
CS294-6 Reconfigurable Computing Day 2 August 27, 1998 FPGA Introduction.
Multiplexers, Decoders, and Programmable Logic Devices
February 4, 2002 John Wawrzynek
Lecture 3 1 ECE 412: Microcomputer Laboratory Lecture 3: Introduction to FPGAs.
Implementing Digital Circuits Lecture L3.1. Implementing Digital Circuits Transistors and Integrated Circuits Transistor-Transistor Logic (TTL) Programmable.
Address Decoding Outline Goal Reading Address Decoding Strategy
CS 151 Digital Systems Design Lecture 38 Programmable Logic.
Chapter 6 Memory and Programmable Logic Devices
Using Programmable Logic to Accelerate DSP Functions 1 Using Programmable Logic to Accelerate DSP Functions “An Overview“ Greg Goslin Digital Signal Processing.
EKT303/4 PRINCIPLES OF PRINCIPLES OF COMPUTER ARCHITECTURE (PoCA)
Section I Introduction to Xilinx
EE4OI4 Engineering Design Programmable Logic Technology.
Section II Basic PLD Architecture. Section II Agenda  Basic PLD Architecture —XC9500 and XC4000 Hardware Architectures —Foundation and Alliance Series.
Electronics in High Energy Physics Introduction to Electronics in HEP Field Programmable Gate Arrays Part 1 based on the lecture of S.Haas.
System Arch 2008 (Fire Tom Wada) /10/9 Field Programmable Gate Array.
Xilinx Development Software Design Flow on Foundation M1.5
J. Christiansen, CERN - EP/MIC
FPGA-Based System Design: Chapter 3 Copyright  2004 Prentice Hall PTR FPGA Fabric n Elements of an FPGA fabric –Logic element –Placement –Wiring –I/O.
FPGA-Based System Design: Chapter 3 Copyright  2004 Prentice Hall PTR Topics n FPGA fabric architecture concepts.
Programmable Logic Devices
Sept. 2005EE37E Adv. Digital Electronics Lesson 1 CPLDs and FPGAs: Technology and Design Features.
Introduction to FPGA Created & Presented By Ali Masoudi For Advanced Digital Communication Lab (ADC-Lab) At Isfahan University Of technology (IUT) Department.
Field Programmable Gate Arrays (FPGAs) An Enabling Technology.
Basic Sequential Components CT101 – Computing Systems Organization.
BR 1/991 Issues in FPGA Technologies Complexity of Logic Element –How many inputs/outputs for the logic element? –Does the basic logic element contain.
EE3A1 Computer Hardware and Digital Design
EKT303/4 PRINCIPLES OF PRINCIPLES OF COMPUTER ARCHITECTURE (PoCA)
Development of Programmable Architecture for Base-Band Processing S. Leung, A. Postula, Univ. of Queensland, Australia A. Hemani, Royal Institute of Tech.,
CPLD Vs. FPGA Positioning Presentation
1 - CPRE 583 (Reconfigurable Computing): Reconfigurable Computing HW, VHDL 2 Iowa State University (Ames) CPRE 583 Reconfigurable Computing Lecture 2:
M.Mohajjel. Why? TTM (Time-to-market) Prototyping Reconfigurable and Custom Computing 2Digital System Design.
ESS | FPGA for Dummies | | Maurizio Donna FPGA for Dummies Basic FPGA architecture.
FPGA-Based System Design: Chapter 1 Copyright  2004 Prentice Hall PTR Moore’s Law n Gordon Moore: co-founder of Intel. n Predicted that number of transistors.
Survey of Reconfigurable Logic Technologies
FPGA-Based System Design: Chapter 3 Copyright  2004 Prentice Hall PTR Topics n FPGA fabric architecture concepts.
Introduction to Field Programmable Gate Arrays (FPGAs) EDL Spring 2016 Johns Hopkins University Electrical and Computer Engineering March 2, 2016.
FPGA Technology Overview Carl Lebsack * Some slides are from the “Programmable Logic” lecture slides by Dr. Morris Chang.
B0110 Fabric and Trust ENGR xD52 Eric VanWyk Fall 2013.
Introduction to the FPGA and Labs
Topics SRAM-based FPGA fabrics: Xilinx. Altera..
Instructor: Dr. Phillip Jones
Field Programmable Gate Array
Field Programmable Gate Array
Field Programmable Gate Array
We will be studying the architecture of XC3000.
Lecture 41: Introduction to Reconfigurable Computing
The Xilinx Virtex Series FPGA
Programmable Configurations
The Xilinx Virtex Series FPGA
Implementing Logic Gates and Circuits
Programmable logic and FPGA
Presentation transcript:

XC6200 Family FPGAs By: Ahmad Alsolaim Alsolaim

Agenda XC6200 Architecture Design Flows Library Support Applications Reconfigurable Processing

Problems Confronting Embedded Control Designers Today Reconfiguration from external memory limited to low frequency Memory CPU I/O High frequency access to registers needed Microprocessor interface consumes resources Reconfigurable Coprocessor (FPGA) Bus access to large number of internal registers requires careful design Insufficient memory capacity for coprocessing algorithms Partial Reconfiguration is difficult I/O

XC6200 System Features Meet Embedded Coprocessing Requirements 1000x improvement in reconfiguration time from external memory CPU Memory I/O FastMAPtm assures high speed access to all internal registers Microprocessor interface built-in Reconfigurable Coprocessor XC6200 All registers accessed via built-in low-skew FastMAPtm busses High capacity distributed memory permits allocation of chip resources to logic or memory Ultrafast Partial Reconfiguration fully supported I/O Up to 100,000 gates !

XC6200 Architectural Overview Array of fine grain function cells, each with a register high gate count for structured logic or regular arrays Abundant, hierarchical routing resources Flexible pin configuration programmable as in, out, bidirectional, tristate CMOS or TTL logic levels

XC6200 Architecture (cont) High speed CPU interface for configuration and register I/O Programmable bus width (8..32-bits) Direct processor read/write access to all user registers All user registers and configuration SRAM mapped into processor address space

XC6200 Architecture Alsolaim 16x16 Tile 4x4 Block Ÿ Ÿ Ÿ FastMAPtm User I/Os Ÿ Ÿ Ÿ FastMAPtm Interface Ÿ Ÿ User I/Os Ÿ User I/Os Ÿ Address Ÿ Ÿ Function Cell Data Control Ÿ Ÿ Ÿ User I/Os ŸNumber of tiles varies between devices in family Alsolaim

Logical Organization: Basic Cell. Alsolaim

Logical Organization: XC6200 Function Unit Function unit allows : any function of 2 variables any flavour of 2:1 mux buffers, inverters, or constant 0s and 1s any of the above in addition to a D-type register 3 I/Ps, each from any of 8 directions; O/P to up to 4 directions

Logical Organization: Function Unit. Figure 6: XC6200 Function Unit Alsolaim

Logical Organization: Function Unit. (cont) Alsolaim

Logical Organization: Function Unit. (cont) Alsolaim

Physical Organization: Cells, Blocks and Tiles Alsolaim

Physical Organization: Cells, Blocks and Tiles (cont) Alsolaim

Routing Resources Example Alsolaim

Routing Switches: Alsolaim

North and South Switches: Alsolaim

East and West Switches: Alsolaim

Clock Distribution: Alsolaim

Clear Distribution: Alsolaim

Input/Output Architecture: Alsolaim

Connections Between IOB’s And Built-In XC6200 Control Logic: Alsolaim

Array Data Sources In West IOB’s: Alsolaim

XC6200 Device Organization Conceptual view Logic symbol G1 G2 GClk GClr OE Reset I/O CS A(15:0) D(31:0) RdWr Logic Array RAM Interface Programmable I/O Alsolaim

The industry’s only random access configuration interface FastMAP CPU Interface The industry’s only random access configuration interface allows for extremely fast full or partial device configuration - you only program the bits you need Allows direct CPU (random) access to user registers supports “coprocessing” applications.

FastMAP CPU Interface (cont) Easily interfaced to most microprocessors and microcontrollers “memory mapped” architecture makes it just like designing with SRAM

FastMAP (cont) Map Register Cell Array Map Register allows mapping of user registers on to 8, 16, or 32 bit data bus Allows unconstrained register placement Obviates need for complex shift and mask operations 1 bit 7 bit 6 bit 5 1 1 Data Bus bit 4 bit 3 bit 2 1 bit 1 bit 0 1 1 User-defined register Cells

Wildcard Registers allow “don’t cares” on address bits FastMAP (cont) Wildcard Registers allow “don’t cares” on address bits same data can be written to several locations (SRAM and user registers) in one cycle fast configuration of bit-slice type designs broadcast of data to registers without tying up valuable routing resources.

Partial Run-time Reconfiguration Extend hardware to a larger (virtual) capacity through rapid reconfiguration Derive time-varying structures that are smaller and faster than the ASIC counterpart Make more transistors participate in a given computation Alsolaim 8 6

Partial Run-time Reconfiguration Alsolaim 5 4

Partial Run-time Reconfiguration Time = <a short time later> Alsolaim 6 5

Reconfiguration Speed vs Traditional Technologies Design Swapping XC4013 200us 250ms Block Swapping XC6216 Circuit Updates Rewiring 40ns ns us ms s

XC6200 Family Members Alsolaim

Design Flows Device Configuration Hierarchical EDIF Schematic Capture VHDL Synthesis Macro Libraries XACTstep Series 6000 Delay File Device Configuration

Primitive gates and functions (compatible with other Xilinx parts) Library Support Primitive gates and functions (compatible with other Xilinx parts) AND, OR, ADD, MULT, etc More complex macros also to be available memory access DSP functions (FIR, FFT, DCT) JTAG, decoders, etc.

Can be used as “regular” FPGA Applications Can be used as “regular” FPGA serial interface allows for booting from PROM Intended to act as hardware accelerator for microprocessors FastMAP allows for direct microprocessor access to “internal” logic fast reconfiguration of all or part of device

Applications (cont) “Context switching” and “virtual hardware” are realistic propositions Typical uses might include DSP, image processing, datapaths, etc.

Reconfigurable Processing “Custom computing” concept, building on fast configuration virtual hardware PCI based development system to be made available can be used as a custom computer in its own right, or as an aid to system development for customers’ designs

XC6000 Software: XACT6000 Software From Xilinx. (will be available soon in our lab) Trianus/Hades Design Entry Software for the XC6200.(available in our lab) Velab: Free VHDL Elaborator for the XC6200. (available in our lab) XC6200 Inspector. (available in our lab) Alsolaim

A Multiplier for the XC6200

A Multiplier for the XC6200 Structure Math Building Lookup Tables Area Optimization Mapping into an XC6200 Changing Coefficients Performance Summary

Distributed Arithmetic (Multiplier) 8 bit data 4 4 LUT LUT 16 X 12 16 X 12 12 8 12 bit adder 12 4

Math Class Constant LUT-B Input LUT-A Input LUT-A Output LUT-B Output Adder Output

Architecture of the Multiplier Pipelined Lookup Tables LUT-B LUT-A B[11:8] B[7:4] A[11:8] B[3:0] A[7:4] A[3:0] Pipelined Adder 4-bit half 4-bit full 4-bit half Pipeline Register Carry Carry P[15:12] P[11:8] P[7:4] P[3:0]

Lookup Table contains all pre-calculated partial products. LUTs by Muxing Lookup Table contains all pre-calculated partial products. Use a Truth Table to determine Mux inputs. All possible products for multiplying by 0011 (3) A0 A1 A2 A3 Px

Two mux levels can be collapsed into a single gate. Optimizing the Lookup Two mux levels can be collapsed into a single gate. The function can be determined with a truth table. No optimization XOR Func1 Optimized A0 A1 NAND Func2 Px A3 A2 ? OR Func3 ? ? BUF Func4 ?

Multiplier Schematic Schematic resembles the block diagram. Two LUTs sourcing adder. The corresponding view in the layout editor. The LUTs are offset to line up bits for adder. Pipeline registers are cheap. XC6216 has 4096 Flip Flops LUT-A LUT-B ADDER

A Closer Look at a Lookup Each 12-bit LUT is built from 12 one bit LUTs. LUTs get stacked vertically.

Determining Coefficients Schematic for a single 4-input LUT. Functions can be determined from the Truth Table.

Changing Coefficients Functionality of a cell is contained in one byte. 32-bit access can change the function of 4 cells per write cycle. 96 cells need writing, or 24 write cycles. (worst case) 1.45ms assuming 33MHz Func1 Func2 Func3 Func4

Summary 8x8 constant coefficient multiplier Pipelined - 75+ MHz performance Small grain architecture - High degree of LUT optimization Coefficients easily changed - Fast reconfig times. High Performance/Dollar