George Mason University ECE 645 – Computer Arithmetic Introduction to FPGA Devices.

World of Integrated Circuits
Integrated Circuits
- Full-Custom ASICs
- Semi-Custom ASICs
- User Programmable
    - PLD: PAL, PLA, PML
    - FPGA: LUT (Look-Up Table), MUX, Gates

Two competing implementation approaches
ASIC (Application Specific Integrated Circuit)
- designed all the way from behavioral description to physical layout
- designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry
FPGA (Field Programmable Gate Array)
- no physical layout design; design ends with a bitstream used to configure a device
- bought off the shelf and reconfigured by designers themselves

What is an FPGA?
- Configurable Logic Blocks
- I/O Blocks
- Block RAMs

Which Way to Go?
ASICs
- High performance
- Low power
- Low cost in high volumes
FPGAs
- Off-the-shelf
- Low development cost
- Short time to market
- Reconfigurability

Other FPGA Advantages
- Manufacturing cycle for ASIC is very costly, lengthy and engages lots of manpower
    - Mistakes not detected at design time have large impact on development time and cost
    - FPGAs are perfect for rapid prototyping of digital circuits
- Easy upgrades, as in the case of software
- Unique applications
    - reconfigurable computing

Major FPGA Vendors
SRAM-based FPGAs
- Xilinx, Inc.
- Altera Corp.
- Atmel
- Lattice Semiconductor
(Xilinx and Altera together share over 60% of the market)
Flash & antifuse FPGAs
- Actel Corp.
- QuickLogic Corp.

Xilinx
- Primary products: FPGAs and the associated CAD software
    - Programmable Logic Devices
    - ISE Alliance and Foundation Series Design Software
- Main headquarters in San Jose, CA
- Fabless* Semiconductor and Software Company
    - UMC (Taiwan) (*Xilinx acquired an equity stake in UMC in 1996)
    - Seiko Epson (Japan)
    - TSMC (Taiwan)

Xilinx FPGA Families
- Old families
    - XC3000, XC4000, XC5200
    - Old 0.5 µm, 0.35 µm and 0.25 µm technology. Not recommended for modern designs.
- High-performance families
    - Virtex (0.22 µm)
    - Virtex-E, Virtex-EM (0.18 µm)
    - Virtex-II, Virtex-II PRO (0.13 µm)
    - Virtex-4 (0.09 µm)
- Low Cost Family
    - Spartan/XL: derived from XC4000
    - Spartan-II: derived from Virtex
    - Spartan-IIE: derived from Virtex-E
    - Spartan-3

Xilinx FPGA Block Diagram [figure]

CLB Structure [figure]

CLB Slice Structure
Each slice contains two sets of the following:
- Four-input LUT
    - Any 4-input logic function,
    - or 16-bit x 1 sync RAM
    - or 16-bit shift register
- Carry & Control
    - Fast arithmetic logic
    - Multiplier logic
    - Multiplexer logic
- Storage element
    - Latch or flip-flop
    - Set and reset
    - True or inverted inputs
    - Sync. or async. control

LUT (Look-Up Table) Functionality
- Look-Up Tables are the primary elements for logic implementation
- Each LUT can implement any function of 4 inputs
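To illustrate why a LUT can implement any function of 4 inputs, here is a minimal behavioral sketch in which the four inputs simply address one of 16 stored bits. The entity name LUT4_SKETCH and the INIT encoding (bit i holds the output for input value i) are assumptions made for this example, not a Xilinx primitive; the default X"8000" describes a 4-input AND.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical entity: behavioral model of one 4-input LUT.
entity LUT4_SKETCH is
    generic(
        -- Assumed encoding: bit i of INIT is the output for input value i.
        -- X"8000" sets only bit 15, which is a 4-input AND.
        INIT : STD_LOGIC_VECTOR(15 downto 0) := X"8000"
    );
    port(
        I : in  STD_LOGIC_VECTOR(3 downto 0);
        O : out STD_LOGIC
    );
end LUT4_SKETCH;

architecture BEHAVIORAL of LUT4_SKETCH is
begin
    -- The inputs select one of the 16 truth-table bits.
    O <= INIT(to_integer(unsigned(I)));
end BEHAVIORAL;
```

Changing only the INIT generic changes the logic function; the structure stays the same.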

5-Input Functions implemented using two LUTs
- One CLB slice can implement any function of 5 inputs
- Logic function is partitioned between two LUTs
- F5 multiplexer selects LUT
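The partitioning can be sketched behaviorally as follows. This is an illustration under stated assumptions, not the actual slice netlist: the entity name FUNC5_SKETCH, the 32-bit INIT generic, and the choice of I(4) as the multiplexer select are all hypothetical. The lower and upper halves of the truth table play the role of the two 4-input LUTs, and the final conditional assignment plays the role of the F5 multiplexer.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical entity: a 5-input function built from two 16-bit tables.
entity FUNC5_SKETCH is
    generic(
        -- Assumed encoding: bit i is the output for input value i.
        -- X"80000000" sets only bit 31, which is a 5-input AND.
        INIT : STD_LOGIC_VECTOR(31 downto 0) := X"80000000"
    );
    port(
        I : in  STD_LOGIC_VECTOR(4 downto 0);
        O : out STD_LOGIC
    );
end FUNC5_SKETCH;

architecture BEHAVIORAL of FUNC5_SKETCH is
    -- Each half of the truth table fits in one 4-input LUT.
    constant LUT_LO : STD_LOGIC_VECTOR(15 downto 0) := INIT(15 downto 0);
    constant LUT_HI : STD_LOGIC_VECTOR(15 downto 0) := INIT(31 downto 16);
    signal F_OUT : STD_LOGIC;
    signal G_OUT : STD_LOGIC;
begin
    -- Both LUTs see the same four low-order inputs.
    F_OUT <= LUT_LO(to_integer(unsigned(I(3 downto 0))));
    G_OUT <= LUT_HI(to_integer(unsigned(I(3 downto 0))));
    -- The fifth input selects between the two LUT outputs.
    O <= G_OUT when I(4) = '1' else F_OUT;
end BEHAVIORAL;
```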

Distributed RAM
- CLB LUT configurable as Distributed RAM
    - A LUT equals 16x1 RAM
    - Implements single and dual ports
    - Cascade LUTs to increase RAM size
- Synchronous write
- Synchronous/asynchronous read
    - Accompanying flip-flops used for synchronous read
- Primitives shown in the figure:
    - RAM16X1S: O, D, WE, WCLK, A0-A3
    - RAM32X1S: O, D, WE, WCLK, A0-A4
    - RAM16X2S: O0, O1, D0, D1, WE, WCLK, A0-A3
    - RAM16X1D: SPO, DPO, D, WE, WCLK, A0-A3, DPRA0-DPRA3

LUT = Shift Register
- Each LUT can be configured as a shift register
    - Serial in, serial out
- Dynamically addressable delay up to 16 cycles
- For programmable pipeline
- Cascade for greater cycle delays
- Use CLB flip-flops to add depth
- Figure signals: IN, CE, CLK, DEPTH[3:0], OUT
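A dynamically addressable shift register of this kind might be described as in the sketch below. The entity name SRL16_SKETCH is hypothetical, and the ports are named DIN and DOUT because IN and OUT are reserved words in VHDL. Whether a synthesis tool maps this description onto a single LUT depends on the tool and device; that mapping is assumed here, not guaranteed.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical entity: 16-stage shift register with an addressable tap.
entity SRL16_SKETCH is
    port(
        CLK   : in  STD_LOGIC;
        CE    : in  STD_LOGIC;
        DIN   : in  STD_LOGIC;
        DEPTH : in  STD_LOGIC_VECTOR(3 downto 0);
        DOUT  : out STD_LOGIC
    );
end SRL16_SKETCH;

architecture BEHAVIORAL of SRL16_SKETCH is
    signal SR : STD_LOGIC_VECTOR(15 downto 0) := (others => '0');
begin
    -- Serial in: shift one position on every enabled rising clock edge.
    process(CLK)
    begin
        if rising_edge(CLK) then
            if CE = '1' then
                SR <= SR(14 downto 0) & DIN;
            end if;
        end if;
    end process;

    -- Serial out: DEPTH selects the tap, giving a delay of
    -- DEPTH + 1 clock cycles (1 to 16).
    DOUT <= SR(to_integer(unsigned(DEPTH)));
end BEHAVIORAL;
```

With DEPTH set to "0011", a bit presented on DIN appears on DOUT after four enabled clock edges.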

Shift Register
- Register-rich FPGA
    - Allows for addition of pipeline stages to increase throughput
- Data paths must be balanced to keep desired functionality
- Figure: 64-bit data paths through Operations A, B, and C, with stage latencies of 4, 8, and 3 cycles, showing a 9-cycle imbalance

Slice structure [figure]
- Two Look-Up Tables (inputs G4-G1 and F4-F1), each followed by Carry & Control Logic and a storage element (D, Q, CK, EC, S, R)
- Other labeled signals: CIN, COUT, F5IN, BY, SR, CLK, CE, YB, Y, XB, X

Fast Carry Logic
- Each CLB contains separate logic and routing for the fast generation of sum & carry signals
    - Increases efficiency and performance of adders, subtractors, accumulators, comparators, and counters
- Carry logic is independent of normal logic and routing resources
- Figure: carry logic routing running from LSB to MSB

Accessing Carry Logic
- All major synthesis tools can infer carry logic for arithmetic functions
    - Addition (SUM <= A + B)
    - Subtraction (DIFF <= A - B)
    - Comparators (if A < B then ...)
    - Counters (count <= count + 1)
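The arithmetic statements listed above could appear in synthesizable code roughly as follows. This is a sketch: the entity name ADD_COUNT_SKETCH, its ports, and the use of IEEE.NUMERIC_STD with unsigned operands are assumptions of this example. Mapping onto the dedicated carry chain is left to the synthesis tool.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical entity: adder, comparator and counter written so that
-- a synthesis tool can infer carry logic.
entity ADD_COUNT_SKETCH is
    generic(
        WIDTH : positive := 8
    );
    port(
        CLK    : in  STD_LOGIC;
        RST    : in  STD_LOGIC;
        A      : in  STD_LOGIC_VECTOR(WIDTH-1 downto 0);
        B      : in  STD_LOGIC_VECTOR(WIDTH-1 downto 0);
        SUM    : out STD_LOGIC_VECTOR(WIDTH-1 downto 0);
        A_LT_B : out STD_LOGIC;
        COUNT  : out STD_LOGIC_VECTOR(WIDTH-1 downto 0)
    );
end ADD_COUNT_SKETCH;

architecture RTL of ADD_COUNT_SKETCH is
    signal COUNT_REG : unsigned(WIDTH-1 downto 0) := (others => '0');
begin
    -- Addition (result wraps modulo 2**WIDTH).
    SUM <= std_logic_vector(unsigned(A) + unsigned(B));

    -- Unsigned comparison.
    A_LT_B <= '1' when unsigned(A) < unsigned(B) else '0';

    -- Counter with synchronous reset.
    process(CLK)
    begin
        if rising_edge(CLK) then
            if RST = '1' then
                COUNT_REG <= (others => '0');
            else
                COUNT_REG <= COUNT_REG + 1;
            end if;
        end if;
    end process;

    COUNT <= std_logic_vector(COUNT_REG);
end RTL;
```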

Block RAM
- Most efficient memory implementation
    - Dedicated blocks of memory
- Ideal for most memory requirements
    - 4 to 104 memory blocks
    - 18 kbits = 18,432 bits per block
    - Use multiple blocks for larger memories
- Builds both single and true dual-port RAMs
- Figure: Spartan-II True Dual-Port Block RAM with Port A and Port B
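As a rough illustration, a single-port memory sized to fill one 18,432-bit block (1024 words of 18 bits) can be described behaviorally as below. The entity name BRAM_SKETCH and its ports are hypothetical. The synchronous, read-before-write style is one common description that synthesis tools may map to block RAM; actual inference rules vary by tool and are not taken from the slides.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical entity: 1024 x 18 single-port synchronous RAM.
entity BRAM_SKETCH is
    port(
        CLK  : in  STD_LOGIC;
        WE   : in  STD_LOGIC;
        ADDR : in  STD_LOGIC_VECTOR(9 downto 0);
        DIN  : in  STD_LOGIC_VECTOR(17 downto 0);
        DOUT : out STD_LOGIC_VECTOR(17 downto 0)
    );
end BRAM_SKETCH;

architecture BEHAVIORAL of BRAM_SKETCH is
    -- 1024 words x 18 bits = 18,432 bits, the size of one block.
    type RAM_TYPE is array (0 to 1023) of STD_LOGIC_VECTOR(17 downto 0);
    signal RAM : RAM_TYPE := (others => (others => '0'));
begin
    process(CLK)
    begin
        if rising_edge(CLK) then
            if WE = '1' then
                RAM(to_integer(unsigned(ADDR))) <= DIN;
            end if;
            -- Registered read; during a write cycle the old
            -- contents of the addressed word are returned.
            DOUT <= RAM(to_integer(unsigned(ADDR)));
        end if;
    end process;
end BEHAVIORAL;
```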

Spartan-3 Block RAM Amounts [table]

Block RAM Port Aspect Ratios
- 16k x 1
- 8k x 2
- 4k x 4
- 2k x (8+1)
- 1024 x (16+2)

Dual Port Block RAM [figure]

Dual-Port Bus Flexibility
- Each port can be configured with a different data bus width
- Provides easy data width conversion without any additional logic
- Figure: RAMB4_S4_S16
    - Port A: In 1K-Bit Depth, Out 18-Bit Width; signals WEA, ENA, RSTA, ADDRA[9:0], CLKA, DIA[17:0], DOA[17:0]
    - Port B: In 2k-Bit Depth, Out 9-Bit Width; signals WEB, ENB, RSTB, ADDRB[8:0], CLKB, DIB[15:0], DOB[8:0]

Two Independent Single-Port RAMs
- Added advantage of True Dual-Port
    - No wasted RAM bits
- Can split a Dual-Port 16K RAM into two Single-Port 8K RAMs
    - Simultaneous independent access to each RAM
- To access the lower RAM
    - Tie the MSB address bit to Logic Low
- To access the upper RAM
    - Tie the MSB address bit to Logic High
- Figure: RAMB4_S1_S1, each port 8K-Bit Depth and 1-Bit Width; one port is addressed with GND, ADDR[12:0] and the other with VCC, ADDR[12:0]
    - Port A signals: WEA, ENA, RSTA, ADDRA[12:0], CLKA, DIA[0], DOA[0]
    - Port B signals: WEB, ENB, RSTB, ADDRB[12:0], CLKB, DIB[0], DOB[0]

New 18 x 18 Embedded Multiplier
- Fast arithmetic functions
- Optimized to implement multiply / accumulate modules

18 x 18 Multiplier
- Embedded 18-bit x 18-bit multiplier
- 2's complement signed operation
- Multipliers are organized in columns
- Figure: 18 x 18 Multiplier with inputs Data_A (18 bits) and Data_B (18 bits) and Output (36 bits)
- Note: See Virtex-II Data Sheet for updated performances
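A signed 18 x 18 multiplication with a 36-bit result can be written as in the following sketch. The entity name MULT18X18_SKETCH, the PRODUCT port name, and the output register are assumptions of this example. Whether the multiplication is placed in an embedded multiplier block is decided by the synthesis tool and target device.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.all;

-- Hypothetical entity: registered signed 18 x 18 multiplier.
entity MULT18X18_SKETCH is
    port(
        CLK     : in  STD_LOGIC;
        DATA_A  : in  STD_LOGIC_VECTOR(17 downto 0);
        DATA_B  : in  STD_LOGIC_VECTOR(17 downto 0);
        PRODUCT : out STD_LOGIC_VECTOR(35 downto 0)
    );
end MULT18X18_SKETCH;

architecture RTL of MULT18X18_SKETCH is
begin
    process(CLK)
    begin
        if rising_edge(CLK) then
            -- Two's complement multiply; the result width is the sum
            -- of the operand widths (18 + 18 = 36 bits).
            PRODUCT <= std_logic_vector(signed(DATA_A) * signed(DATA_B));
        end if;
    end process;
end RTL;
```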

Basic I/O Block Structure [figure]
- Three flip-flops (pins D, EC, Q, SR), one each for Three-State Control, Output Path, and Input Path
- Labeled signals: Three-State, Output, Clock, Set/Reset, FF Enable, Direct Input, Registered Input

IOB Functionality
- IOB provides interface between the package pins and CLBs
- Each IOB can work as uni- or bi-directional I/O
- Outputs can be forced into High Impedance
- Inputs and outputs can be registered
    - advised for high-performance I/O
- Inputs can be delayed

Routing Resources [figure: CLBs interconnected through PSMs (Programmable Switch Matrices)]

Clock Distribution [figure]

Spartan-3 FPGA Family Members [table]

FPGA Nomenclature [figure]

Device Part Marking
We're Using: XC3S100-4FG256

Virtex-II 1.5V Architecture [figure]
- Configurable Logic Block
- I/O Block
- Block RAMs
- Multipliers 18 x 18

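Virtex-II 1.5V
(-- marks values missing from this transcript; slice and distributed RAM counts follow from 4 slices per CLB and 32 bits per slice)

Device     CLB Array   Slices    Maximum I/O   BlockRAM (18kb)   Multiplier Blocks   Distributed RAM bits
XC2V40     8x8         256       --            --                --                  8,192
XC2V80     16x8        512       --            --                --                  16,384
XC2V250    24x16       1,536     --            --                --                  49,152
XC2V500    32x24       3,072     --            --                --                  98,304
XC2V1000   40x32       5,120     --            --                --                  163,840
XC2V1500   48x40       7,680     --            --                --                  245,760
XC2V2000   56x48       10,752    --            --                --                  344,064
XC2V3000   64x56       14,336    --            --                --                  458,752
XC2V4000   80x72       23,040    --            --                --                  737,280
XC2V6000   96x88       33,792    --            --                --                  1,081,344
XC2V8000   112x104     46,592    --            --                --                  1,490,944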

Virtex-II Block SelectRAM
- Virtex-II BRAM is 18 kbits
- Additional "parity" bits available in selected configurations

Width   Depth    Address   Data     Parity
1       16,384   [13:0]    [0]      N/A
2       8,192    [12:0]    [1:0]    N/A
4       4,096    [11:0]    [3:0]    N/A
9       2,048    [10:0]    [7:0]    [0]
18      1,024    [9:0]     [15:0]   [1:0]
36      512      [8:0]     [31:0]   [3:0]

Using Library Components in VHDL Code

RAM 16x1 (1)

    library IEEE;
    use IEEE.STD_LOGIC_1164.all;
    library UNISIM;
    use UNISIM.all;

    entity RAM_16X1_DISTRIBUTED is
        port(
            CLK      : in  STD_LOGIC;
            WE       : in  STD_LOGIC;
            ADDR     : in  STD_LOGIC_VECTOR(3 downto 0);
            DATA_IN  : in  STD_LOGIC;
            DATA_OUT : out STD_LOGIC
        );
    end RAM_16X1_DISTRIBUTED;

RAM 16x1 (2)

    architecture RAM_16X1_DISTRIBUTED_STRUCTURAL of RAM_16X1_DISTRIBUTED is

        -- Initial contents; the label must match the instance name RAM_16X1_S_1
        attribute INIT : string;
        attribute INIT of RAM_16X1_S_1: label is "F0C1";

        -- Component declaration of the "ram16x1s(ram16x1s_v)" unit
        -- File name contains "ram16x1s" entity: ./src/unisim_vital.vhd
        component ram16x1s
            generic(
                INIT : BIT_VECTOR(15 downto 0) := X"0000");
            port(
                O    : out std_ulogic;
                A0   : in  std_ulogic;
                A1   : in  std_ulogic;
                A2   : in  std_ulogic;
                A3   : in  std_ulogic;
                D    : in  std_ulogic;
                WCLK : in  std_ulogic;
                WE   : in  std_ulogic);
        end component;

RAM 16x1 (3)

    begin

        RAM_16X1_S_1: ram16x1s
            generic map (INIT => X"F0C1")
            port map (
                O    => DATA_OUT,
                A0   => ADDR(0),
                A1   => ADDR(1),
                A2   => ADDR(2),
                A3   => ADDR(3),
                D    => DATA_IN,
                WCLK => CLK,
                WE   => WE
            );

    end RAM_16X1_DISTRIBUTED_STRUCTURAL;

RAM 16x8 (1)

    library IEEE;
    use IEEE.STD_LOGIC_1164.all;
    library UNISIM;
    use UNISIM.all;

    entity RAM_16X8_DISTRIBUTED is
        port(
            CLK      : in  STD_LOGIC;
            WE       : in  STD_LOGIC;
            ADDR     : in  STD_LOGIC_VECTOR(3 downto 0);
            DATA_IN  : in  STD_LOGIC_VECTOR(7 downto 0);
            DATA_OUT : out STD_LOGIC_VECTOR(7 downto 0)
        );
    end RAM_16X8_DISTRIBUTED;

RAM 16x8 (2)

    architecture RAM_16X8_DISTRIBUTED_STRUCTURAL of RAM_16X8_DISTRIBUTED is

        -- The INIT value ("0000") is supplied through the generic map of each
        -- instance. An attribute specification here cannot name the label
        -- RAM_16X1_S_1, because that label is declared inside the generate
        -- statement rather than in this declarative part.

        -- Component declaration of the "ram16x1s(ram16x1s_v)" unit
        -- File name contains "ram16x1s" entity: ./src/unisim_vital.vhd
        component ram16x1s
            generic(
                INIT : BIT_VECTOR(15 downto 0) := X"0000");
            port(
                O    : out std_ulogic;
                A0   : in  std_ulogic;
                A1   : in  std_ulogic;
                A2   : in  std_ulogic;
                A3   : in  std_ulogic;
                D    : in  std_ulogic;
                WCLK : in  std_ulogic;
                WE   : in  std_ulogic);
        end component;

RAM 16x8 (3)

    begin

        -- One 16x1 RAM per data bit; all share the address and control lines
        GENERATE_MEMORY:
        for I in 0 to 7 generate
            RAM_16X1_S_1: ram16x1s
                generic map (INIT => X"0000")
                port map (
                    O    => DATA_OUT(I),
                    A0   => ADDR(0),
                    A1   => ADDR(1),
                    A2   => ADDR(2),
                    A3   => ADDR(3),
                    D    => DATA_IN(I),
                    WCLK => CLK,
                    WE   => WE
                );
        end generate;

    end RAM_16X8_DISTRIBUTED_STRUCTURAL;

ROM 16x1 (1)

    library IEEE;
    use IEEE.STD_LOGIC_1164.all;
    library UNISIM;
    use UNISIM.all;

    entity ROM_16X1_DISTRIBUTED is
        port(
            ADDR     : in  STD_LOGIC_VECTOR(3 downto 0);
            DATA_OUT : out STD_LOGIC
        );
    end ROM_16X1_DISTRIBUTED;

ROM 16x1 (2)

    architecture ROM_16X1_DISTRIBUTED_STRUCTURAL of ROM_16X1_DISTRIBUTED is

        -- ROM contents; the label must match the instance name ROM_16X1_S_1
        attribute INIT : string;
        attribute INIT of ROM_16X1_S_1: label is "F0C1";

        component ram16x1s
            generic(
                INIT : BIT_VECTOR(15 downto 0) := X"0000");
            port(
                O    : out std_ulogic;
                A0   : in  std_ulogic;
                A1   : in  std_ulogic;
                A2   : in  std_ulogic;
                A3   : in  std_ulogic;
                D    : in  std_ulogic;
                WCLK : in  std_ulogic;
                WE   : in  std_ulogic);
        end component;

        -- Constant '0' used to tie off the unused write port
        signal Low : std_ulogic := '0';

ROM 16x1 (3)

    begin

        -- Data, write clock and write enable are tied low, so the
        -- initial contents are never overwritten
        ROM_16X1_S_1: ram16x1s
            generic map (INIT => X"F0C1")
            port map (
                O    => DATA_OUT,
                A0   => ADDR(0),
                A1   => ADDR(1),
                A2   => ADDR(2),
                A3   => ADDR(3),
                D    => Low,
                WCLK => Low,
                WE   => Low
            );

    end ROM_16X1_DISTRIBUTED_STRUCTURAL;