FPGA Devices & FPGA Design Flow

Presentation transcript:

FPGA Devices & FPGA Design Flow. ECE 448 Lecture 7. ECE 448 – FPGA and ASIC Design with VHDL

Reading
Required: P. Chu, FPGA Prototyping by VHDL Examples, Chapter 2.2, FPGA.
Recommended: S. Brown and Z. Vranesic, Fundamentals of Digital Logic with VHDL Design, Chapter 3.6.5, Field-Programmable Gate Arrays.

Required Reading
Xilinx, Inc., Spartan-3E FPGA Family.
Module 1: Introduction; Features; Architectural Overview; Package Marking.
Module 2: Configurable Logic Block (CLB) and Slice Resources; Dedicated Multipliers.

Two competing implementation approaches
ASIC (Application Specific Integrated Circuit): designed all the way from behavioral description to physical layout; designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry.
FPGA (Field Programmable Gate Array): no physical layout design; design ends with a bitstream used to configure a device; bought off the shelf and reconfigured by designers themselves.

What is an FPGA?
Configurable Logic Blocks; I/O Blocks; Block RAMs.

Which Way to Go?
ASICs: high performance; low power; low cost in high volumes.
FPGAs: off-the-shelf; low development cost; short time to market; reconfigurability.

Other FPGA Advantages
The manufacturing cycle for an ASIC is very costly and lengthy, and engages lots of manpower. Mistakes not detected at design time have a large impact on development time and cost. FPGAs are perfect for rapid prototyping of digital circuits. Easy upgrades, as in the case of software. Unique applications: reconfigurable computing.

Major FPGA Vendors
SRAM-based FPGAs: Xilinx, Inc. and Altera Corp. (together share about 90% of the market); Atmel; Lattice Semiconductor.
Flash & antifuse FPGAs: Actel Corp.; QuickLogic Corp.

The Programmable Marketplace, Q1 Calendar Year 2005
[Figure: two pie charts, "PLD Segment" and "FPGA Sub-Segment", with slices for Xilinx, Altera, Lattice, Actel, QuickLogic (2%), Other (2%) and All Others. Remaining slice values as extracted: 5%, 7%, 58%, 33%, 51%, 31%, 11%.]
Two dominant suppliers, indicating a maturing market.
Note: Atmel and Cypress numbers (each less than 1%) are not included in this calculation. Source: company reports. Latest information available; computed on a 4-quarter rolling basis.
It is clear from these two charts that Xilinx is not only the clear leader in programmable logic products, but is also the leader in FPGA market share. This is due primarily to the fact that we produce products that meet the requirements of our customers. We understand the problems facing our customers and we make it our business to provide solutions to those problems.

Xilinx
Primary products: FPGAs and the associated CAD software (Programmable Logic Devices; ISE Alliance and Foundation Series Design Software).
Main headquarters in San Jose, CA.
Fabless* semiconductor and software company. Fabrication: UMC (Taiwan) {*Xilinx acquired an equity stake in UMC in 1996}, Seiko Epson (Japan), TSMC (Taiwan), Samsung (Korea).

Xilinx FPGA Families
Old families: XC3000, XC4000, XC5200. Old 0.5 µm, 0.35 µm and 0.25 µm technology. Not recommended for modern designs.
High-performance families: Virtex (220 nm); Virtex-E, Virtex-EM (180 nm); Virtex-II (130 nm); Virtex-II PRO (130 nm); Virtex-4 (90 nm); Virtex-5 (65 nm); Virtex-6 (40 nm).
Low-cost families: Spartan/XL (derived from XC4000); Spartan-II (derived from Virtex); Spartan-IIE (derived from Virtex-E); Spartan-3 (90 nm); Spartan-3E (90 nm, logic optimized); Spartan-3A (90 nm, I/O optimized); Spartan-3AN (90 nm, non-volatile); Spartan-3A DSP (90 nm, DSP optimized); Spartan-6 (45 nm).

CLB Structure

General structure of an FPGA. The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Xilinx CLB. The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

CLB Structure
The configurable logic block (CLB) contains two slices. Each slice contains two 4-input look-up tables (LUTs), carry and control logic, and two registers. Two 3-state buffers are associated with each CLB and can be accessed by all the outputs of the CLB. Xilinx is the only major FPGA vendor that provides dedicated resources for on-chip 3-state bussing. This feature can increase performance and lower CLB utilization for wide multiplexer functions. The Xilinx internal bus can also be extended off chip.

Xilinx CLB Slice. The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

CLB Slice Structure
Each slice contains two sets of the following:
Four-input LUT: any 4-input logic function, or 16-bit x 1 sync RAM (SLICEM only), or 16-bit shift register (SLICEM only).
Carry & Control: fast arithmetic logic; multiplier logic; multiplexer logic.
Storage element: latch or flip-flop; set and reset; true or inverted inputs; sync. or async. control.
Two slices form a CLB. These slices can be used independently or together for wider logic functions. Within each slice, the LUT and the flip-flop can be used for the same function or for independent functions. The flip-flops do not restrict designers to having only a set or only a clear. For more ASIC-like flows, the flip-flop can be used as a latch, so designers do not need to re-code the design for the device architecture.

LUT (Look-Up Table) Functionality
Look-up tables are the primary elements for logic implementation. Each LUT can implement any function of 4 inputs.
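
As a hedged illustration of why a LUT can implement any function of 4 inputs, it can be modeled as a 16-entry truth table addressed by its inputs. The entity name lut4_model, the INIT generic and its default value are assumptions made for this sketch; they are not taken from the lecture.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Behavioral model of a 4-input LUT (hypothetical entity, for illustration).
-- INIT is the truth table: bit k is the output for input value k.
entity lut4_model is
  generic (
    -- Assumed default: x"6996" is the truth table of a 4-input XOR.
    INIT : std_logic_vector(15 downto 0) := x"6996"
  );
  port (
    i : in  std_logic_vector(3 downto 0);  -- i(3) is the address MSB
    o : out std_logic
  );
end entity lut4_model;

architecture behavioral of lut4_model is
begin
  -- The four inputs form an address that selects one stored bit.
  o <= INIT(to_integer(unsigned(i)));
end architecture behavioral;
```

Changing only INIT changes the logic function, which is the sense in which the same hardware cell covers every 4-input function.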

5-Input Functions implemented using two LUTs
One CLB slice can implement any function of 5 inputs. The logic function is partitioned between two LUTs. The F5 multiplexer selects the LUT.
[Figure: two LUTs feeding the F5 multiplexer, which drives OUT.]
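
The partitioning described above can be sketched as follows: two 16-entry tables hold the two halves of a 32-entry truth table, and the fifth input selects between them. The entity name func5_two_luts, the TRUTH generic and its default (a 5-input XOR) are assumptions for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a 5-input function built from two 4-input LUTs
-- and a 2:1 multiplexer playing the role of the F5 multiplexer.
entity func5_two_luts is
  generic (
    -- Assumed default: truth table of a 5-input XOR.
    TRUTH : std_logic_vector(31 downto 0) := x"96696996"
  );
  port (
    x : in  std_logic_vector(4 downto 0);
    y : out std_logic
  );
end entity func5_two_luts;

architecture behavioral of func5_two_luts is
  signal lut_f : std_logic;  -- half of the function where x(4) = '0'
  signal lut_g : std_logic;  -- half of the function where x(4) = '1'
begin
  lut_f <= TRUTH(to_integer(unsigned(x(3 downto 0))));
  lut_g <= TRUTH(16 + to_integer(unsigned(x(3 downto 0))));

  -- The fifth input selects which LUT output reaches the slice output.
  y <= lut_g when x(4) = '1' else lut_f;
end architecture behavioral;
```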

Xilinx Multipurpose LUT. The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Simplified view of a Xilinx Logic Cell. The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Distributed RAM
CLB LUT configurable as distributed RAM. Primitives shown: RAM16X1S (D, WE, WCLK, A0-A3, O), RAM32X1S (adds A4), RAM16X2S (D0, D1, O0, O1), RAM16X1D (adds DPRA0-DPRA3, SPO, DPO).
A single LUT equals a 16x1 RAM. Two LUTs implement single- and dual-port RAMs. Cascade LUTs to increase RAM size. Synchronous write. Synchronous/asynchronous read: the accompanying flip-flops are used for synchronous read.
When the CLB LUT is configured as memory, it can implement a 16x1 synchronous RAM. One LUT can implement a 16x1 single-port RAM. Two LUTs are used to implement a 16x1 dual-port RAM. The LUTs can be cascaded for the desired memory depth and width. The write operation is synchronous. The read operation is asynchronous and can be made synchronous by using the accompanying flip-flops of the CLB LUT. Distributed RAM is compact and fast, which makes it ideal for small RAM-based functions.

Shift Register
Each LUT can be configured as a shift register (ports: IN, CE, CLK, DEPTH[3:0], OUT). Serial in, serial out. Dynamically addressable delay of up to 16 cycles, for programmable pipelines. Cascade for greater cycle delays. Use CLB flip-flops to add depth.
The LUT can be configured as a shift register (serial in, serial out) with a length programmable from 1 to 16 bits. For example, DEPTH[3:0] = 0010 (binary) means that the shift register is 3 bits long. In the simplest case, a 16-bit shift register can be implemented in a single LUT, eliminating the need for 16 flip-flops and the extra routing resources that would otherwise have lowered performance.
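
A minimal behavioral sketch of such a dynamically addressable shift register is given below. The entity name srl16_model and its port names are assumptions chosen to mirror the labels on the slide; whether a synthesis tool maps this description onto a single LUT depends on the tool and device, so treat it as an illustration rather than a guaranteed inference template.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a LUT used as a 16-bit shift register.
-- The output tap is chosen by depth, so the delay is depth + 1 clock cycles.
entity srl16_model is
  port (
    clk   : in  std_logic;
    ce    : in  std_logic;                     -- clock enable
    d     : in  std_logic;                     -- serial input
    depth : in  std_logic_vector(3 downto 0);  -- selected tap (0 to 15)
    q     : out std_logic                      -- serial output
  );
end entity srl16_model;

architecture behavioral of srl16_model is
  signal sr : std_logic_vector(15 downto 0) := (others => '0');
begin
  -- Shift on every enabled rising clock edge; no reset, as in a LUT-based register.
  process (clk)
  begin
    if rising_edge(clk) then
      if ce = '1' then
        sr <= sr(14 downto 0) & d;
      end if;
    end if;
  end process;

  -- The read address is asynchronous: changing depth changes the tap immediately.
  q <= sr(to_integer(unsigned(depth)));
end architecture behavioral;
```

With depth = "0010", a bit presented at d appears at q after three enabled clock edges, matching the 3-bit example in the notes.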

Shift Register
A register-rich FPGA allows for the addition of pipeline stages to increase throughput. Data paths must be balanced to keep the desired functionality.
[Figure: 64-bit data paths through Operation A (4 cycles), Operation B (8 cycles) and Operation C (3 cycles); 12 cycles on one path, 9-cycle imbalance.]
In this example there is a cycle imbalance that must be fixed: as seen from the slide, the logic will be off by nine clock cycles. Consider how the shift register can fix the imbalanced cycles.
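
One possible way to remove a nine-cycle imbalance is to delay the shorter path by nine clock cycles. The sketch below is an assumption about how that could be written, not the lecture's own solution; the entity name delay_line and the generics WIDTH and DELAY are hypothetical. Because the registers have no reset, a synthesis tool may be able to map them onto LUT-based shift registers, but that depends on the tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical fixed delay line used to balance two pipelined data paths.
entity delay_line is
  generic (
    WIDTH : positive := 64;  -- data path width (64 bits in the slide's figure)
    DELAY : positive := 9    -- number of clock cycles to add
  );
  port (
    clk  : in  std_logic;
    din  : in  std_logic_vector(WIDTH-1 downto 0);
    dout : out std_logic_vector(WIDTH-1 downto 0)
  );
end entity delay_line;

architecture rtl of delay_line is
  type stage_array is array (0 to DELAY-1) of std_logic_vector(WIDTH-1 downto 0);
  signal stages : stage_array := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      stages(0) <= din;
      -- Each stage copies its predecessor, so data advances one stage per cycle.
      for k in 1 to DELAY-1 loop
        stages(k) <= stages(k-1);
      end loop;
    end if;
  end process;

  dout <= stages(DELAY-1);
end architecture rtl;
```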

Carry & Control Logic
[Figure: slice diagram with two look-up tables (inputs G1-G4 and F1-F4), two carry & control logic blocks and two flip-flops (D, Q, CK, EC, S, R); slice pins COUT, CIN, YB, Y, XB, X, BY, SR, F5IN, CLK, CE.]

Full-adder
$x + y + c_{in} = (c_{out}\, s)_2$
x y cin | cout s
0 0 0 | 0 0
0 0 1 | 0 1
0 1 0 | 0 1
0 1 1 | 1 0
1 0 0 | 0 1
1 0 1 | 1 0
1 1 0 | 1 0
1 1 1 | 1 1

Full-adder: alternative implementations
x y | cout | s
0 0 | 0 | cin
0 1 | cin | not cin
1 0 | cin | not cin
1 1 | 1 | cin

Full-adder: alternative implementations
Implementation used to generate fast carry logic in Xilinx FPGAs:
$p = x \oplus y$, $g = y$, $s = p \oplus c_{in} = x \oplus y \oplus c_{in}$
A multiplexer controlled by p produces the carry: cout = g when p = 0, and cout = cin when p = 1.
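
The propagate/generate form above extends naturally to a multi-bit ripple-carry adder. The following is a sketch under stated assumptions: the entity name carry_chain_adder and the generic N are hypothetical, and the code models the equations behaviorally rather than instantiating any vendor carry primitive.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical N-bit adder built from the p/g full-adder form:
-- p = x xor y, g = y, s = p xor cin, cout = cin when p = '1' else g.
entity carry_chain_adder is
  generic (
    N : positive := 8
  );
  port (
    x, y : in  std_logic_vector(N-1 downto 0);
    cin  : in  std_logic;
    s    : out std_logic_vector(N-1 downto 0);
    cout : out std_logic
  );
end entity carry_chain_adder;

architecture dataflow of carry_chain_adder is
  signal p : std_logic_vector(N-1 downto 0);  -- propagate signals
  signal c : std_logic_vector(N downto 0);    -- carry chain, c(0) = cin
begin
  c(0) <= cin;
  p    <= x xor y;

  carry_chain : for k in 0 to N-1 generate
    -- One carry multiplexer per bit: pass the incoming carry or use g = y(k).
    c(k+1) <= c(k) when p(k) = '1' else y(k);
  end generate carry_chain;

  s    <= p xor c(N-1 downto 0);
  cout <= c(N);
end architecture dataflow;
```

Only the multiplexer lies on the carry path of each bit, which is why the slide distinguishes the LUT part (computing p) from the hardwired fast logic.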

Carry & Control Logic in Spartan 3 FPGAs LUT Hardwired (fast) logic

Critical Path for an Adder Implemented Using Xilinx Spartan 3/Spartan 3E FPGAs

Number and Length of Carry Chains for Spartan 3E FPGAs

Bottom Operand Input to Carry Out Delay: T_OPCYF = 0.9 ns for Spartan 3

Carry Propagation Delay: T_BYP = 0.2 ns for Spartan 3

Carry Input to Top Sum Combinational Output Delay: T_CINY = 1.2 ns for Spartan 3

Critical Path Delays and Maximum Clock Frequencies (taking into account surrounding registers)

Fast Carry Logic
Each CLB contains separate logic and routing for the fast generation of sum and carry signals. This increases the efficiency and performance of adders, subtractors, accumulators, comparators, and counters. Carry logic is independent of normal logic and routing resources.
[Figure: carry logic routing running from LSB to MSB.]

Accessing Carry Logic
All major synthesis tools can infer carry logic for arithmetic functions: addition (SUM <= A + B), subtraction (DIFF <= A - B), comparators (if A < B then ...), counters (count <= count + 1).
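
The four arithmetic forms listed above can be written with ieee.numeric_std as in the sketch below. The entity name arith_infer, the port names and the 8-bit width are assumptions for illustration; whether each operator ends up on the dedicated carry chain is decided by the synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example of arithmetic written so that a synthesis tool
-- can infer carry logic: adder, subtractor, comparator and counter.
entity arith_infer is
  port (
    clk, reset : in  std_logic;
    a, b       : in  std_logic_vector(7 downto 0);
    sum        : out std_logic_vector(7 downto 0);
    diff       : out std_logic_vector(7 downto 0);
    a_lt_b     : out std_logic;
    count      : out std_logic_vector(7 downto 0)
  );
end entity arith_infer;

architecture rtl of arith_infer is
  signal count_reg : unsigned(7 downto 0) := (others => '0');
begin
  sum    <= std_logic_vector(unsigned(a) + unsigned(b));    -- addition
  diff   <= std_logic_vector(unsigned(a) - unsigned(b));    -- subtraction
  a_lt_b <= '1' when unsigned(a) < unsigned(b) else '0';    -- comparator

  -- Counter with synchronous reset.
  process (clk)
  begin
    if rising_edge(clk) then
      if reset = '1' then
        count_reg <= (others => '0');
      else
        count_reg <= count_reg + 1;
      end if;
    end if;
  end process;

  count <= std_logic_vector(count_reg);
end architecture rtl;
```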

Embedded Multipliers

RAM Blocks and Multipliers in Xilinx FPGAs. The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

Dedicated Multiplier Block

Interface of a Dedicated Multiplier

Configurations of a Dedicated Multiplier

Cascade of multipliers

3 Ways to Use Dedicated Hardware
Three (3) ways to use dedicated (embedded) hardware: inference, instantiation, CoreGen.

Inferred Multiplier
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    entity mult18x18 is
      generic (
        word_size   : natural := 18;
        signed_mult : boolean := true);
      port (
        clk : in  std_logic;
        a   : in  std_logic_vector(1*word_size-1 downto 0);
        b   : in  std_logic_vector(1*word_size-1 downto 0);
        c   : out std_logic_vector(2*word_size-1 downto 0));
    end entity mult18x18;

    architecture infer of mult18x18 is
    begin
      -- Registered product; signed_mult selects signed or unsigned arithmetic.
      process(clk)
      begin
        if rising_edge(clk) then
          if signed_mult then
            c <= std_logic_vector(signed(a) * signed(b));
          else
            c <= std_logic_vector(unsigned(a) * unsigned(b));
          end if;
        end if;
      end process;
    end architecture infer;

Forcing a particular implementation in VHDL
Synthesis tool: Xilinx XST
    attribute MULT_STYLE : string;
    attribute MULT_STYLE of mult18x18 : entity is "block";
Allowed values of the attribute: block (dedicated multiplier); lut (LUT-based multiplier); pipe_block (pipelined dedicated multiplier); pipe_lut (pipelined LUT-based multiplier); auto (automatic choice by the synthesis tool).

Memories

Memory Types
Memory: RAM or ROM.
Memory: single port or dual port.
Memory: with asynchronous read or with synchronous read.

Memory Types
Memory: distributed (MLUT-based) or block RAM-based (BRAM-based).
Memory: inferred or instantiated (manually, or using Core Generator).

FPGA Distributed Memory

CLB Slice
[Figure: slice diagram with two look-up tables (inputs G1-G4 and F1-F4), two carry & control logic blocks and two flip-flops (D, Q, CK, EC, S, R); slice pins COUT, CIN, YB, Y, XB, X, BY, SR, F5IN, CLK, CE.]

Distributed RAM
CLB LUT configurable as distributed RAM. Primitives shown: RAM16X1S (D, WE, WCLK, A0-A3, O), RAM32X1S (adds A4), RAM16X2S (D0, D1, O0, O1), RAM16X1D (adds DPRA0-DPRA3, SPO, DPO).
An LUT equals a 16x1 RAM. Cascade LUTs to increase RAM size. Synchronous write. Asynchronous read: distributed RAM read is naturally asynchronous, and a synchronous read can be created by using extra flip-flops. Two LUTs can make a 32 x 1 single-port RAM, a 16 x 2 single-port RAM, or a 16 x 1 dual-port RAM.
When the CLB LUT is configured as memory, it can implement a 16x1 synchronous RAM. The LUTs can be cascaded for the desired memory depth and width. Distributed RAM is compact and fast, which makes it ideal for small RAM-based functions.
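
The synchronous-write, asynchronous-read behavior described above can be sketched in VHDL as follows. The entity name dist_ram16x1 and its port names are assumptions (loosely following the RAM16X1S labels on the slide); this is a behavioral description that a synthesis tool may map to distributed RAM, not an instantiation of the primitive itself.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 16 x 1 single-port RAM: synchronous write, asynchronous read.
entity dist_ram16x1 is
  port (
    clk : in  std_logic;
    we  : in  std_logic;                     -- write enable
    a   : in  std_logic_vector(3 downto 0);  -- shared read/write address
    d   : in  std_logic;                     -- data in
    o   : out std_logic                      -- data out
  );
end entity dist_ram16x1;

architecture rtl of dist_ram16x1 is
  signal ram : std_logic_vector(15 downto 0) := (others => '0');
begin
  -- The write happens only on the rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        ram(to_integer(unsigned(a))) <= d;
      end if;
    end if;
  end process;

  -- The read is combinational: the output follows the address without a clock.
  o <= ram(to_integer(unsigned(a)));
end architecture rtl;
```

Registering o in a clocked process would give the synchronous-read variant mentioned in the notes.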

FPGA Block RAM

Block RAM
[Figure: Spartan-3 dual-port block RAM with Port A and Port B.]
Most efficient memory implementation: dedicated blocks of memory, ideal for most memory requirements. 4 to 36 memory blocks in Spartan 3E. 18 kbits = 18,432 bits per block (16 k without parity bits). Use multiple blocks for larger memories. Builds both single and true dual-port RAMs. Synchronous write and read (different from distributed RAM).
The block RAM is true dual-port, which means it has two independent read and write ports that can be read and/or written simultaneously, independently of each other. All control logic is implemented within the RAM, so no additional CLB logic is required to implement a dual-port configuration. The Altera 10KE and ACEX 1K families have only 2-port RAM; to emulate dual-port capability, they would need twice the number of memory blocks, at half the performance.
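
For contrast with distributed RAM, a memory with both synchronous write and synchronous read can be described as in the sketch below. The entity name bram_sp, the generics and the port names are assumptions; the default 1024 x 18 organization is chosen only because it adds up to 18,432 bits. Whether this code is mapped onto a block RAM is up to the synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port RAM with synchronous write and synchronous read.
entity bram_sp is
  generic (
    ADDR_BITS : positive := 10;  -- 2**10 = 1024 words
    DATA_BITS : positive := 18   -- 16 data bits + 2 parity bits per word
  );
  port (
    clk  : in  std_logic;
    en   : in  std_logic;
    we   : in  std_logic;
    addr : in  std_logic_vector(ADDR_BITS-1 downto 0);
    di   : in  std_logic_vector(DATA_BITS-1 downto 0);
    dout : out std_logic_vector(DATA_BITS-1 downto 0)
  );
end entity bram_sp;

architecture rtl of bram_sp is
  type ram_type is array (0 to 2**ADDR_BITS-1) of std_logic_vector(DATA_BITS-1 downto 0);
  signal ram : ram_type := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= di;
        end if;
        -- Registered read: on a write, the previously stored word is returned.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```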

Spartan-3E Block RAM Amounts

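Block RAM can have various configurations (port aspect ratios)
Data width 1: 16k x 1 (addresses 0 to 16,383)
Data width 2: 8k x 2 (addresses 0 to 8,191)
Data width 4: 4k x 4 (addresses 0 to 4,095)
Data width 8+1: 2k x (8+1) (addresses 0 to 2,047)
Data width 16+2: 1024 x (16+2) (addresses 0 to 1,023)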

Block RAM Port Aspect Ratios

Single-Port Block RAM
[Figure: single-port block RAM symbol; data buses DI[w-p-1:0] and DO[w-p-1:0].]

Dual-Port Block RAM
[Figure: dual-port block RAM symbol; data buses DIA[wA-pA-1:0], DOA[wA-pA-1:0], DIB[wB-pB-1:0], DOB[wB-pB-1:0].]

Dual-Port Bus Flexibility
RAMB4_S18_S9. Port A (WEA, ENA, RSTA, CLKA, ADDRA[9:0], DIA[17:0], DOA[17:0]): Port A In, 1K-bit depth; Port A Out, 18-bit width. Port B (WEB, ENB, RSTB, CLKB, ADDRB[10:0], DIB[8:0], DOB[8:0]): Port B In, 2K-bit depth; Port B Out, 9-bit width.
Each port can be configured with a different data bus width. This provides easy data width conversion without any additional logic.

Two Independent Single-Port RAMs
RAMB4_S1_S1. Port A (WEA, ENA, RSTA, CLKA, ADDRA[12:0], DIA[0], DOA[0]): Port A In, 8K-bit depth; Port A Out, 1-bit width; address = 0, ADDR[12:0]. Port B (WEB, ENB, RSTB, CLKB, ADDRB[12:0], DIB[0], DOB[0]): Port B In, 8K-bit depth; Port B Out, 1-bit width; address = 1, ADDR[12:0].
To access the lower RAM, tie the MSB address bit to logic low. To access the upper RAM, tie the MSB address bit to logic high.
Added advantage of true dual-port: no wasted RAM bits. A dual-port 16K RAM can be split into two single-port 8K RAMs, with simultaneous independent access to each RAM.

Generic Inferred ROM

Distributed ROM with asynchronous read
    LIBRARY ieee;
    USE ieee.std_logic_1164.all;
    USE ieee.std_logic_arith.all;

    entity rominfr is
      generic (
        bits      : integer := 10;  -- number of bits per ROM word
        addr_bits : integer := 3);  -- 2^addr_bits = number of words in ROM
      port (
        a  : in  std_logic_vector(addr_bits-1 downto 0);
        do : out std_logic_vector(bits-1 downto 0));
    end rominfr;

Distributed ROM with asynchronous read
    architecture behavioral of rominfr is
      type rom_type is array (2**addr_bits-1 downto 0) of std_logic_vector(bits-1 downto 0);
      -- The slide lists 7 of the 2**addr_bits = 8 words. As an assumption,
      -- the unlisted word is filled with zeros so that the aggregate is complete.
      constant ROM : rom_type := (
        "0000110001", "0100110100", "0100110110", "0110110000",
        "0000111100", "0111110101", "1111100111",
        others => (others => '0'));
    begin
      -- Combinational (asynchronous) read: the output follows the address.
      do <= ROM(conv_integer(unsigned(a)));
    end behavioral;

Distributed ROM with synchronous read
    LIBRARY ieee;
    USE ieee.std_logic_1164.all;
    USE ieee.std_logic_arith.all;
    USE ieee.std_logic_unsigned.all;

    entity rominfr is
      generic (
        bits      : integer := 10;  -- number of bits per ROM word
        addr_bits : integer := 3);  -- 2^addr_bits = number of words in ROM
      port (
        a   : in  std_logic_vector(addr_bits-1 downto 0);
        clk : in  std_logic;
        en  : in  std_logic;
        do  : out std_logic_vector(bits-1 downto 0));
    end rominfr;

Distributed ROM with synchronous read
    architecture behavioral of rominfr is
      type rom_type is array (2**addr_bits-1 downto 0) of std_logic_vector(bits-1 downto 0);
      -- The slide lists 7 of the 2**addr_bits = 8 words. As an assumption,
      -- the unlisted word is filled with zeros so that the aggregate is complete.
      constant ROM : rom_type := (
        "0000110001", "0100110100", "0100110110", "0110110000",
        "0000111100", "0111110101", "1111100111",
        others => (others => '0'));
    begin
      -- Registered (synchronous) read, gated by the enable input.
      process(clk)
      begin
        if rising_edge(clk) then
          if en = '1' then
            do <= ROM(conv_integer(unsigned(a)));
          end if;
        end if;
      end process;
    end behavioral;

Forcing a particular implementation in VHDL
Synthesis tool: Xilinx XST
    attribute ROM_STYLE : string;
    attribute ROM_STYLE of rominfr : entity is "block";
Allowed values of the attribute: block (block RAM); distributed (distributed, LUT-based memory); auto (automatic choice by the synthesis tool).

Specification of memory types recognized by Synplify Pro
    signal memory : vector_array;
    attribute syn_ramstyle : string;
    -- Block RAM memory:
    attribute syn_ramstyle of memory : signal is "block_ram";
    -- LUT-based distributed memory (use instead of the line above):
    attribute syn_ramstyle of memory : signal is "select_ram";

Report from Synthesis
    Resource Usage Report for raminfr
    Mapping to part: xc3s50pq208-5
    Cell usage:
    GND             1 use
    RAMB16_S36      1 use
    VCC             1 use
    I/O ports: 69
    I/O primitives: 68
    IBUF           36 uses
    OBUF           32 uses
    BUFGP           1 use
    I/O Register bits: 0
    Register bits not including I/Os: 0 (0%)
    RAM/ROM usage summary
    Block Rams : 1 of 4 (25%)
    Global Clock Buffers: 1 of 8 (12%)
    Mapping Summary:
    Total LUTs: 0 (0%)

Report from Implementation
    Design Summary:
    Number of errors: 0
    Number of warnings: 0
    Logic Utilization:
    Logic Distribution:
    Number of Slices containing only related logic: 0 out of 0 0%
    Number of Slices containing unrelated logic: 0 out of 0 0%
    *See NOTES below for an explanation of the effects of unrelated logic
    Number of bonded IOBs: 69 out of 124 55%
    Number of Block RAMs: 1 out of 4 25%
    Number of GCLKs: 1 out of 8 12%

Input/Output Blocks (IOBs)

Basic I/O Block Structure
[Figure: IOB with three flip-flops (D, Q, EC enable, SR set/reset, shared clock): one on the three-state control path, one on the output path, and one on the input path, which offers both a direct input and a registered input.]

IOB Functionality
The IOB provides the interface between the package pins and the CLBs. Each IOB can work as a uni- or bi-directional I/O. Outputs can be forced into high impedance. Inputs and outputs can be registered, which is advised for high-performance I/O. Inputs can be delayed.
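
A bidirectional pin with registered input, output and three-state control can be described as in the sketch below. The entity name bidir_pin and the port names are assumptions; placing these registers inside the IOB rather than in CLBs is a decision made by the implementation tools (possibly guided by constraints), so the code only expresses the intended behavior.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical bidirectional I/O with registered data and output enable.
entity bidir_pin is
  port (
    clk       : in    std_logic;
    oe        : in    std_logic;  -- '1' drives the pad, '0' releases it
    to_pad    : in    std_logic;  -- value to drive onto the pad
    from_pad  : out   std_logic;  -- registered value read from the pad
    pad       : inout std_logic   -- package pin
  );
end entity bidir_pin;

architecture rtl of bidir_pin is
  signal out_reg : std_logic := '0';
  signal oe_reg  : std_logic := '0';
  signal in_reg  : std_logic := '0';
begin
  -- Three registers, mirroring the output, three-state and input paths.
  process (clk)
  begin
    if rising_edge(clk) then
      out_reg <= to_pad;
      oe_reg  <= oe;
      in_reg  <= pad;
    end if;
  end process;

  -- High impedance when the output is disabled.
  pad      <= out_reg when oe_reg = '1' else 'Z';
  from_pad <= in_reg;
end architecture rtl;
```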

Spartan-3E Family Attributes

Spartan-3E FPGA Family Members

FPGA Nomenclature

FPGA device present on the Digilent Basys2 board
XC3S100E-4CP132: Spartan 3E family; 100 k equivalent logic gates; speed grade -4 = standard performance; package type CP; 132 pins.

FPGA Design Flow

Design flow (1)
Specification (Lab Experiments): Design and implement a simple unit that speeds up encryption with an RC5-similar cipher with a fixed key set on an 8031 microcontroller. Unlike in experiment 5, this time your unit has to be able to perform an encryption algorithm by itself, executing 32 rounds...
VHDL description (Your Source Files):
    library IEEE;
    use ieee.std_logic_1164.all;
    use ieee.std_logic_unsigned.all;

    entity RC5_core is
      port (
        clock, reset, encr_decr : in  std_logic;
        data_input  : in  std_logic_vector(31 downto 0);
        data_output : out std_logic_vector(31 downto 0);
        out_full    : in  std_logic;
        key_input   : in  std_logic_vector(31 downto 0);
        key_read    : out std_logic
      );
    end RC5_core;
Next steps: functional simulation; synthesis; post-synthesis simulation.

Design flow (2)
Implementation; timing simulation; configuration; on-chip testing.

Tools used in FPGA Design Flow
Design: VHDL code, then functionally verified VHDL code.
Synthesis (Synplicity Synplify Pro or Xilinx XST): produces the netlist.
Implementation (Xilinx ISE): produces the bitstream.

Synthesis

Synthesis Tools
Xilinx XST, Synplify Pro, and others.

Logic Synthesis
VHDL description, translated into a circuit netlist:
    architecture MLU_DATAFLOW of MLU is
      signal A1 : STD_LOGIC;
      signal B1 : STD_LOGIC;
      signal Y1 : STD_LOGIC;
      signal MUX_0, MUX_1, MUX_2, MUX_3 : STD_LOGIC;
    begin
      A1 <= A when (NEG_A = '0') else not A;
      B1 <= B when (NEG_B = '0') else not B;
      Y  <= Y1 when (NEG_Y = '0') else not Y1;
      MUX_0 <= A1 and B1;
      MUX_1 <= A1 or B1;
      MUX_2 <= A1 xor B1;
      MUX_3 <= A1 xnor B1;
      with (L1 & L0) select
        Y1 <= MUX_0 when "00",
              MUX_1 when "01",
              MUX_2 when "10",
              MUX_3 when others;
    end MLU_DATAFLOW;

Circuit netlist (RTL view)

Mapping
[Figure: netlist mapped onto LUT0 to LUT5 and flip-flops FF1 and FF2.]

RTL view in Synplify Pro
General logic structures can be recognized in the RTL view: comparator, incrementer, MUX.

Crossprobing between RTL view and code
Each port, net or block can be chosen by mouse click from the browser or directly from the RTL View. Double-clicking on an element shows its source code. Reverse crossprobing is also possible: if a section of code is marked, the corresponding element of the RTL View is marked too.

Technology View in Synplify Pro
Technology view is a mapped RTL view. It can be opened by pressing the toolbar button or by double-clicking on the ".srm" file. As in the case of the RTL View, the toolbar buttons can be used here. Two additional buttons are enabled: show critical path, and open timing analyst. Note that the technology view is usually large and presented on a number of sheets. Technology view is presented using device primitives. A browser lists ports, nets and blocks.

Viewing critical path
The critical path can be viewed by pressing the toolbar button. Delay values are written near each component of the path.

Timing Analyst
The timing analyst is opened by pressing the toolbar button. It gives a possibility to analyze different paths in the design. The timing analyst can be opened only from the Technology View.

Implementation
After synthesis, the entire implementation process is performed by FPGA vendor tools.

Translation
Synthesis produces the circuit netlist in EDIF (Electronic Design Interchange Format) and timing constraints in an NCF (Native Constraint File). The Constraint Editor or a text editor produces the UCF (User Constraint File). Translation combines them into an NGD (Native Generic Database) file.

Pin Assignment
[Figure: design LAB2 inside the FPGA; signals CLOCK, RESET, CONTROL(0) to CONTROL(2) and SEGMENTS(0) to SEGMENTS(6) assigned to package pins P10, B10, H3, K2, G5, K3, H1, K4, G4, H5, H6, H2.]

Example of a UCF File
    NET "CLOCK" LOC = "P10";
    NET "reset" LOC = "B10";
    NET "S_SEG0<6>" LOC = "H1";
    NET "S_SEG0<5>" LOC = "G4";
    NET "S_SEG0<4>" LOC = "G5";
    NET "S_SEG0<3>" LOC = "H5";
    NET "S_SEG0<2>" LOC = "H6";
    NET "S_SEG0<1>" LOC = "H3";
    NET "S_SEG0<0>" LOC = "H2";

Mapping
[Figure: netlist mapped onto LUT0 to LUT5 and flip-flops FF1 and FF2.]

Placing
[Figure: CLB slices placed within the FPGA.]

Routing
[Figure: programmable connections within the FPGA.]

Configuration
Once a design is implemented, you must create a file that the FPGA can understand. This file is called a bitstream: a BIT file (.bit extension). The BIT file can be downloaded directly to the FPGA, or can be converted into a PROM file, which stores the programming information.

Two main stages of the FPGA Design Flow
Synthesis
RTL Synthesis (technology independent): code analysis; derivation of main logic constructions; technology independent optimization; creation of "RTL View".
Map (technology dependent): mapping of extracted logic structures to device primitives; technology dependent optimization; application of "synthesis constraints"; netlist generation; creation of "Technology View".
Implementation
Place & Route: placement of the generated netlist onto the device; choosing the best interconnect structure for the placed design; application of "physical constraints".
Configure: bitstream generation; burning the device.

Report files

Map report header
    Release 8.1i Map I.24
    Xilinx Mapping Report File for Design 'Lab3Demo'
    Design Information
    ------------------
    Command Line   : c:\Xilinx\bin\nt\map.exe -p 3S1500FG320-4 -o map.ncd -pr b -k 4 -cm area -c 100 Lab3Demo.ngd Lab3Demo.pcf
    Target Device  : xc3s1500
    Target Package : fg320
    Target Speed   : -4
    Mapper Version : spartan3 -- $Revision: 1.34 $
    Mapped Date    : Tue Feb 13 17:04:54 2007

Map report
    Design Summary
    --------------
    Number of errors: 0
    Number of warnings: 0
    Logic Utilization:
    Number of Slice Flip Flops: 30 out of 26,624 1%
    Number of 4 input LUTs: 38 out of 26,624 1%
    Logic Distribution:
    Number of occupied Slices: 33 out of 13,312 1%
    Number of Slices containing only related logic: 33 out of 33 100%
    Number of Slices containing unrelated logic: 0 out of 33 0%
    *See NOTES below for an explanation of the effects of unrelated logic
    Total Number 4 input LUTs: 62 out of 26,624 1%
    Number used as logic: 38
    Number used as a route-thru: 24
    Number of bonded IOBs: 10 out of 221 4%
    IOB Flip Flops: 7
    Number of GCLKs: 1 out of 8 12%

Place & route report
Asterisk (*) preceding a constraint indicates it was not met. This may be due to a setup or hold violation.
    Constraint | Requested | Actual | Logic Levels | Absolute Slack | Number of errors
    * TS_CLOCK = PERIOD TIMEGRP "CLOCK" 5 ns HIGH 50% | 5.000ns | 5.140ns | 4 | -0.140ns | 5
    TS_gen1Hz_Clock1Hz = PERIOD TIMEGRP "gen1Hz_Clock1Hz" 5 ns HIGH 50% | 5.000ns | 4.137ns | 2 | 0.863ns | 0

Post layout timing report
    Clock to Setup on destination clock CLOCK
    Source Clock | Src:Rise Dest:Rise | Src:Fall Dest:Rise | Src:Rise Dest:Fall | Src:Fall Dest:Fall
    CLOCK        | 5.140              |                    |                    |
    Timing summary:
    ---------------
    Timing errors: 9  Score: 543
    Constraints cover 574 paths, 0 nets, and 187 connections
    Design statistics:
    Minimum period: 5.140ns (Maximum frequency: 194.553MHz)
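
The two derived figures in these reports follow directly from the reported periods. As a short worked check (the symbol names below are notation introduced here, not taken from the tool output):

```latex
\[
f_{\max} = \frac{1}{T_{\min}} = \frac{1}{5.140\ \text{ns}} \approx 194.553\ \text{MHz},
\qquad
\text{slack} = T_{\text{requested}} - T_{\text{actual}}
             = 5.000\ \text{ns} - 5.140\ \text{ns} = -0.140\ \text{ns}.
\]
```

A negative slack means the 5 ns period constraint on CLOCK was missed by 0.140 ns, which is why that constraint is marked with an asterisk.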