George Mason University ATHENa - Automated Tool for Hardware EvaluatioN Modern FPGA Families ECE 545 Lecture 12.



George Mason University ATHENa

Resources: ATHENa website

ATHENa – Automated Tool for Hardware EvaluatioN
Supported in part by the National Institute of Standards & Technology (NIST)

ATHENa Team
– Venkata “Vinny”, MS CpE student
– Ekawat “Ice”, PhD CpE student
– Marcin, PhD ECE student
– Rajesh, PhD ECE student
– Michal, PhD exchange student from Slovakia
– John, MS CpE student

ATHENa – Automated Tool for Hardware EvaluatioN
Open-source benchmarking tool, written in Perl, aimed at the AUTOMATED generation of OPTIMIZED results for MULTIPLE hardware platforms. Currently under development at George Mason University.

Why Athena?
“The Greek goddess Athena was frequently called upon to settle disputes between the gods or various mortals. Athena Goddess of Wisdom was known for her superb logic and intellect. Her decisions were usually well-considered, highly ethical, and seldom motivated by self-interest.”
from “Athena, Greek Goddess of Wisdom and Craftsmanship”

[Figure: Basic Dataflow of ATHENa. Elements shown: Designer; User; ATHENa Server; HDL + FPGA tools; interfaces + testbenches; HDL + scripts + configuration files; FPGA synthesis and implementation; result summary + database entries; database query; ranking of designs; download of scripts and configuration files.]

[Figure: ATHENa inputs and outputs. Inputs: synthesizable source files, configuration files, testbench, constraint files. Outputs: result summary (user-friendly), database entries (machine-friendly).]

ATHENa Major Features (1)
– synthesis, implementation, and timing analysis in batch mode
– support for devices and tools of multiple FPGA vendors
– generation of results for multiple families of FPGAs of a given vendor
– automated choice of a best-matching device within a given family

ATHENa Major Features (2)
– automated verification of designs through simulation in batch mode
– support for multi-core processing
– automated extraction and tabulation of results
– several optimization strategies aimed at finding
  – optimum options of tools
  – best target clock frequency
  – best starting point of placement

Generation of Results Facilitated by ATHENa
– batch mode of FPGA tools
– ease of extraction and tabulation of results: text reports, Excel, CSV (Comma-Separated Values)
– optimized choice of tool options: GMU_optimization_1 strategy

Relative Improvement of Results from Using ATHENa
Virtex 5, 256-bit variants of hash functions
Ratios of results obtained using ATHENa-suggested options vs. default options of FPGA tools

Other (Somewhat) Similar Tools
– ExploreAhead (part of PlanAhead)
– Design Space Explorer (DSE)
– Boldport Flow
– EDAx10 Cloud Platform

Distinguishing Features of ATHENa
– Support for multiple tools from multiple vendors
– Optimization strategies aimed at the best possible performance rather than design closure
– Extraction and presentation of results
– Seamless integration with the ATHENa database of results

How To Start Working With ATHENa? One-Time Tasks
– Download and unzip ATHENa
– Read the Tutorial!
– Install the Required Tools (see Tutorial - Part 1 – Tools Installation)
– Run ATHENa_setup

How To Start Working With ATHENa? Repetitive Tasks
– Prepare or modify your source files & source_list.txt
– Modify design.config.txt + possibly other configuration files
– Run ATHENa

design.config.txt: Your Design
# directory containing synthesizable source files for the project
SOURCE_DIR =
# A file list containing list of files in the order suitable for synthesis and implementation
# low level modules first, top level entity last
SOURCE_LIST_FILE = source_list.txt
# project name
# it will be used in the names of result directories
PROJECT_NAME = SHA256
# name of top level entity
TOP_LEVEL_ENTITY = sha256
# name of top level architecture
TOP_LEVEL_ARCH = rs_arch
# name of clock net
CLOCK_NET = clk

design.config.txt: Timing Formulas
#formula for latency
LATENCY = TCLK*65
#formula for throughput
THROUGHPUT = 512/(TCLK*65)
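The two formulas can be checked numerically. The sketch below assumes that TCLK is the clock period in nanoseconds, that the design needs 65 clock cycles per 512-bit block (both constants come from the formulas above), and that throughput is reported in Mbits/s, which requires a factor of 1000 when TCLK is in ns. The function names and the unit conversion are illustrative assumptions, not ATHENa code.

```python
def latency_ns(tclk_ns: float, cycles: int = 65) -> float:
    """LATENCY = TCLK*65, with TCLK assumed to be in nanoseconds."""
    return tclk_ns * cycles


def throughput_mbps(tclk_ns: float, block_bits: int = 512, cycles: int = 65) -> float:
    """THROUGHPUT = 512/(TCLK*65), converted from bits/ns to Mbits/s."""
    bits_per_ns = block_bits / (tclk_ns * cycles)
    return bits_per_ns * 1000.0  # 1 bit/ns = 1000 Mbits/s


if __name__ == "__main__":
    tclk = 1000.0 / 100.0  # a 100 MHz clock has a 10 ns period
    print(f"latency    = {latency_ns(tclk):.1f} ns")
    print(f"throughput = {throughput_mbps(tclk):.1f} Mbits/s")
```

Because both quantities depend only on TCLK, any improvement in the achieved clock period translates directly into lower latency and higher throughput for a fixed architecture.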

design.config.txt: Application & Optimization Target
# OPTIMIZATION_TARGET = speed | area | balanced
OPTIMIZATION_TARGET = speed
# OPTIONS = default | user
OPTIONS = default
# APPLICATION = single_run | exhaustive_search | placement_search | frequency_search |
# GMU_Optimization_1 | GMU_Xilinx_optimization_1
APPLICATION = single_run
# TRIM_MODE = off | zip | delete
TRIM_MODE = zip

design.config.txt: FPGA Families (Xilinx)
# commenting the next line removes all families of Xilinx
FPGA_VENDOR = xilinx
#commenting the next line removes a given family
FPGA_FAMILY = spartan3
# FPGA_DEVICES = | best_match | all
FPGA_DEVICES = best_match
SYN_CONSTRAINT_FILE = default
IMP_CONSTRAINT_FILE = default
REQ_SYN_FREQ = 120
REQ_IMP_FREQ = 100
MAX_SLICE_UTILIZATION = 0.8
MAX_BRAM_UTILIZATION = 0.8
MAX_MUL_UTILIZATION = 1
MAX_PIN_UTILIZATION = 0.9
END FAMILY
END VENDOR

design.config.txt: FPGA Families (Altera)
# commenting the next line removes all families of Altera
FPGA_VENDOR = altera
#commenting the next line removes a given family
FPGA_FAMILY = Stratix III
# FPGA_DEVICES = | best_match | all
FPGA_DEVICES = best_match
SYN_CONSTRAINT_FILE = default
IMP_CONSTRAINT_FILE = default
REQ_IMP_FREQ = 120
MAX_LOGIC_UTILIZATION = 0.8
MAX_MEMORY_UTILIZATION = 0.8
MAX_DSP_UTILIZATION = 0
MAX_MUL_UTILIZATION = 0
MAX_PIN_UTILIZATION = 0.8
END FAMILY
END VENDOR
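The configuration slides all follow the same convention: `KEY = value` pairs, `#` comments, and bare block terminators such as `END FAMILY`. A minimal reader for that convention might look like the sketch below. ATHENa itself is written in Perl, so this Python function (`parse_config`, a hypothetical name) only illustrates the file format and is not the tool's parser.

```python
from typing import List, Tuple


def parse_config(text: str) -> List[Tuple[str, str]]:
    """Return (key, value) pairs in file order; comments and blank lines are skipped.

    Lines without '=' (e.g. 'END FAMILY') are returned with an empty value.
    A list is used instead of a dict because keys such as FPGA_VENDOR
    may repeat, once per vendor block.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop everything after '#'
        if not line:
            continue
        if "=" in line:
            key, value = line.split("=", 1)
            entries.append((key.strip(), value.strip()))
        else:
            entries.append((line, ""))
    return entries


SAMPLE = """\
# commenting the next line removes all families of Xilinx
FPGA_VENDOR = xilinx
FPGA_FAMILY = spartan3
FPGA_DEVICES = best_match
REQ_SYN_FREQ = 120
MAX_SLICE_UTILIZATION = 0.8
END FAMILY
END VENDOR
"""

if __name__ == "__main__":
    for key, value in parse_config(SAMPLE):
        print(f"{key!r:28} -> {value!r}")
```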

Library Files
– device_lib/xilinx_device_lib.txt
– device_lib/altera_device_lib.txt
Files created during ATHENa setup. They characterize the FPGA families and devices available in the versions of Xilinx and Altera tools installed on your computer.
Currently supported tool versions:
– Xilinx WebPACK 9.1, 9.2, 10.1, 11.1, 11.5, 12.1, 12.2, 12.3
– Xilinx Design Suite 11.1, 12.1, 12.2, 12.3
– Altera Quartus II Web Edition 8.1, 8.2, 9.0, 9.1, 10.0
– Altera Quartus II Subscription Edition 9.1, 10.0
If a library for a given version is not available yet, use a library from the closest available version.

Library Files: device_lib/xilinx_device_lib.txt
VENDOR = Xilinx
#Device, Total Slices, Block RAMs, DSP, Dedicated Multipliers, Maximum User I/O Pins
ITEM_ORDER = SLICE, BRAM, DSP, MULT, IO
FAMILY = spartan3
xc3s50pq208-5, 768, 4, 0, 4, 124
xc3s200ft256-5, 1920, 12, 0, 12, 173
xc3s400fg456-5, 3584, 16, 0, 16, 264
xc3s1000fg676-5, 7680, 24, 0, 24, 391
xc3s1500fg676-5, 13312, 32, 0, 32, 487
END_FAMILY
FAMILY = virtex5
xc5vlx30ff676-3, 4800, 32, 32, 0, 400
xc5vfx30tff665-3, 5120, 68, 64, 0, 360
xc5vlx30tff665-3, 4800, 36, 32, 0, 360
xc5vlx50ff1153-3, 7200, 48, 48, 0, 560
xc5vlx50tff1136-3, 7200, 60, 48, 0, 480
END_FAMILY
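One plausible reading of FPGA_DEVICES = best_match is "the smallest device in the family whose resources, scaled by the MAX_*_UTILIZATION limits, still cover the design". The sketch below implements that rule on the spartan3 entries listed above. The selection rule, the function name `best_match`, and the DSP limit of 1.0 are assumptions made for illustration; the slides do not spell out ATHENa's actual algorithm.

```python
# (device, slices, BRAMs, DSPs, multipliers, I/O pins), copied from the library excerpt
SPARTAN3 = [
    ("xc3s50pq208-5", 768, 4, 0, 4, 124),
    ("xc3s200ft256-5", 1920, 12, 0, 12, 173),
    ("xc3s400fg456-5", 3584, 16, 0, 16, 264),
    ("xc3s1000fg676-5", 7680, 24, 0, 24, 391),
    ("xc3s1500fg676-5", 13312, 32, 0, 32, 487),
]

# Utilization limits in ITEM_ORDER (SLICE, BRAM, DSP, MULT, IO).
# Slice, BRAM, MUL and pin limits follow the Xilinx config slide; the DSP limit is assumed.
LIMITS = (0.8, 0.8, 1.0, 1.0, 0.9)


def best_match(devices, need, limits):
    """Return the smallest device (by slice count) that fits `need`, or None."""
    for name, *capacity in sorted(devices, key=lambda d: d[1]):
        if all(n <= cap * lim for n, cap, lim in zip(need, capacity, limits)):
            return name
    return None


if __name__ == "__main__":
    # A design needing 74 slices, 4 BRAMs, 0 DSPs, 7 multipliers and 20 I/O pins
    print(best_match(SPARTAN3, (74, 4, 0, 7, 20), LIMITS))
```

With these inputs the smallest device is rejected because 4 block RAMs exceed 80% of its 4 available, so the rule moves on to the next device in the list.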

Result Files: report_resource_utilization.txt
xilinx : spartan
| GENERIC | DEVICE | RUN | LUTs | % | SLICES | % | BRAMs | % | MULTs | % | DSPs | % | IO | % |
| default | xc3s200ft256-5* | 1 | 142 | 3 | 74 | 3 | 4 | 33 | 7 | 58 | 0 | 0 | 20 | 11 |
xilinx : spartan
| GENERIC | DEVICE | RUN | LUTs | % | SLICES | % | BRAMs | % | MULTs | % | DSPs | % | IO | % |
| default | xc6slx9csg324-3* | 1 | 41 | 1 | 22 | 1 | 4 | 6 | 0 | 0 | 9 | 56 | 20 | 10 |
xilinx : virtex
| GENERIC | DEVICE | RUN | LUTs | % | SLICES | % | BRAMs | % | MULTs | % | DSPs | % | IO | % |
| default | xc5vlx20tff323-2* | 1 | 101 | 1 | 56 | 1 | 4 | 15 | 0 | 0 | 9 | 37 | 20 | 11 |
xilinx : virtex
| GENERIC | DEVICE | RUN | LUTs | % | SLICES | % | BRAMs | % | MULTs | % | DSPs | % | IO | % |
| default | xc6vlx75tff784-3* | 1 | 44 | 1 | 21 | 1 | 4 | 1 | 0 | 0 | 9 | 3 | 20 | 5 |

Result Files: report_timing.txt
REQ SYN FREQ – Requested synthesis clk. freq.
SYN FREQ – Achieved synthesis clk. freq.
REQ SYN TCLK – Requested synthesis clk. period
SYN TCLK – Achieved synthesis clk. period
REQ IMP FREQ – Requested implement. clk. freq.
IMP FREQ – Achieved implement. clk. freq.
REQ IMP TCLK – Requested implement. clk. period
IMP TCLK – Achieved implement. clk. period
LATENCY – Latency [ns]
THROUGHPUT – Throughput [Mbits/s]
TP/Area – Throughput/Area [(Mbits/s)/CLB slices]
Latency*Area – Latency*Area [ns*CLB slices]
xilinx : spartan
| GENERIC | DEVICE | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area | Latency*Area |
| default | xc3s200ft256-5* | 1 | default | | default | | default | | default | | | | | |
xilinx : spartan
| GENERIC | DEVICE | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area | Latency*Area |
| default | xc6slx9csg324-3* | 1 | default | | default | | default | | default | | | | | |
xilinx : virtex
| GENERIC | DEVICE | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area | Latency*Area |
| default | xc5vlx20tff323-2* | 1 | default | | default | | default | | default | | | | | |
xilinx : virtex
| GENERIC | DEVICE | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area | Latency*Area |
| default | xc6vlx75tff784-3* | 1 | default | | default | | default | | default | | | | | |

Result Files: report_options.txt
xilinx : spartan
| GENERIC | DEVICE | RUN | COST TABLE | Synthesis Options | Map Options | PAR Options |
| default | xc3s200ft256-5* | 1 | 1 | -opt_level 1 -opt_mode speed | -c 100 -pr b -cm speed | -w -ol std |
xilinx : spartan
| GENERIC | DEVICE | RUN | COST TABLE | Synthesis Options | Map Options | PAR Options |
| default | xc6slx9csg324-3* | 1 | 1 | -opt_level 1 -opt_mode speed | -c 100 -pr b | -w -ol std |
xilinx : virtex
| GENERIC | DEVICE | RUN | COST TABLE | Synthesis Options | Map Options | PAR Options |
| default | xc5vlx20tff323-2* | 1 | 1 | -opt_level 1 -opt_mode speed | -c 100 -pr b -cm speed | -w -ol std |
xilinx : virtex
| GENERIC | DEVICE | RUN | COST TABLE | Synthesis Options | Map Options | PAR Options |
| default | xc6vlx75tff784-3* | 1 | 1 | -opt_level 1 -opt_mode speed | -c 100 -pr b | -w -ol std |
COST TABLE – parameter determining the starting point of placement
Synthesis Options – options of the synthesis tool
Map Options – options of the mapping tool
PAR Options – options of the place & route tool

Result Files: report_execution_time.txt
xilinx : spartan
| GENERIC | DEVICE | RUN | Synthesis Time | Implementation Time | Elapsed Time |
| default | xc3s200ft256-5* | 1 | 0d 0h:0m:12s | 0d 0h:0m:36s | 0d 0h:0m:48s |
xilinx : spartan
| GENERIC | DEVICE | RUN | Synthesis Time | Implementation Time | Elapsed Time |
| default | xc6slx9csg324-3* | 1 | 0d 0h:0m:21s | 0d 0h:1m:13s | 0d 0h:1m:34s |
xilinx : virtex
| GENERIC | DEVICE | RUN | Synthesis Time | Implementation Time | Elapsed Time |
| default | xc5vlx20tff323-2* | 1 | 0d 0h:0m:39s | 0d 0h:1m:50s | 0d 0h:2m:29s |
xilinx : virtex
| GENERIC | DEVICE | RUN | Synthesis Time | Implementation Time | Elapsed Time |
| default | xc6vlx75tff784-3* | 1 | 0d 0h:0m:22s | 0d 0h:3m:22s | 0d 0h:3m:44s |
Synthesis Time – time of synthesis
Implementation Time – time of implementation
Elapsed Time – total time
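The time stamps in this report use a fixed `Nd Nh:Nm:Ns` layout, which makes them easy to post-process. The sketch below converts a stamp to seconds and confirms that, for the four runs shown, Elapsed Time equals Synthesis Time plus Implementation Time. The helper name `to_seconds` and the regular expression are illustrative assumptions based only on the layout visible in the report.

```python
import re

# Matches stamps such as "0d 0h:1m:34s"
_STAMP = re.compile(r"(\d+)d\s+(\d+)h:(\d+)m:(\d+)s")


def to_seconds(stamp: str) -> int:
    """Convert a 'Nd Nh:Nm:Ns' time stamp to a number of seconds."""
    match = _STAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    days, hours, minutes, seconds = map(int, match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# (device, synthesis time, implementation time, elapsed time) from the report
RUNS = [
    ("xc3s200ft256-5", "0d 0h:0m:12s", "0d 0h:0m:36s", "0d 0h:0m:48s"),
    ("xc6slx9csg324-3", "0d 0h:0m:21s", "0d 0h:1m:13s", "0d 0h:1m:34s"),
    ("xc5vlx20tff323-2", "0d 0h:0m:39s", "0d 0h:1m:50s", "0d 0h:2m:29s"),
    ("xc6vlx75tff784-3", "0d 0h:0m:22s", "0d 0h:3m:22s", "0d 0h:3m:44s"),
]

if __name__ == "__main__":
    for device, syn, imp, total in RUNS:
        consistent = to_seconds(syn) + to_seconds(imp) == to_seconds(total)
        print(f"{device}: {to_seconds(total)} s total, consistent={consistent}")
```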

design.config.txt: Functional Simulation (1)
# FUNCTIONAL_VERIFICATION_MODE =
FUNCTIONAL_VERIFICATION_MODE =
# directory containing source files of the testbench
VERIFICATION_DIR =
# A file containing a list of testbench files in the order suitable for compilation;
# low level modules first, top level entity last.
# Test vector files should be located in the same directory and listed
# in the same file, unless fixed path is used. Please refer to tutorial for more detail.
VERIFICATION_LIST_FILE =
# name of testbench's top level entity
TB_TOP_LEVEL_ENTITY =
# name of testbench's top level architecture
TB_TOP_LEVEL_ARCH =

design.config.txt: Functional Simulation (2)
# MAX_TIME_FUNCTIONAL_VERIFICATION =
# supported units are: ps, ns, us, and ms
# if blank, simulation will run until it finishes
# (finishes = no changes in signals, i.e., clock is stopped and no more inputs coming in)
MAX_TIME_FUNCTIONAL_VERIFICATION = <>
# Perform only verification (synthesis and implementation parameters are ignored)
# VERIFICATION_ONLY =
VERIFICATION_ONLY =

ATHENa – Database of Results

ATHENa Database

ATHENa Database – Result View
– Algorithm parameters
– Design parameters
  – Optimization target
  – Architecture type
  – Datapath width
  – I/O bus widths
  – Availability of source code
– Platform
  – Vendor, Family, Device
– Timing
  – Maximum clock frequency
  – Maximum throughput
– Resource utilization
  – Logic blocks (Slices/LEs/ALUTs)
  – Multipliers/DSP units
– Tools
  – Names & versions
  – Detailed options
– Credits
  – Designers & contact information

ATHENa Database – Compare Feature
– Matching fields in grey
– Non-matching fields in red and blue

Possible Future Customizations
The same basic database can be customized and adapted for other domains, such as
– Digital Signal Processing
– Bioinformatics
– Communications
– Scientific Computing, etc.

ATHENa - Website

ATHENa Website
– Download of ATHENa Tool
– Links to related tools
– SHA-3 Competition in FPGAs & ASICs
  – Specifications of candidates
  – Interface proposals
  – RTL source codes
  – Testbenches
– ATHENa database of results
– Related papers & presentations

GMU Source Codes and Block Diagrams
First batch of GMU Source Codes for all Round 3 SHA-3 Candidates & SHA-2 made available at the ATHENa website.
Included in this release:
– Basic architectures
– Folded architectures
– Unrolled architectures
Each code supports two variants: with 256-bit and 512-bit output.
Each source code is accompanied by comprehensive hierarchical block diagrams.

ATHENa Result Replication Files
– Scripts and configuration files sufficient to easily reproduce all results (without repeating optimizations)
– Automatically created by ATHENa for all results generated using ATHENa
– Stored in the ATHENa Database
In the same spirit of Reproducible Research as:
– Patrick Vandewalle (EPFL), Jelena Kovacevic (CMU), and Martin Vetterli (EPFL), “Reproducible research in signal processing - what, why, and how,” IEEE Signal Processing Magazine, May
– J. Claerbout (Stanford University), “Electronic documents give reproducible research a new meaning,” in Proc. 62nd Ann. Int. Meeting of the Soc. of Exploration Geophysics, 1992

Benchmarking Goals Facilitated by ATHENa
Comparing multiple:
1. cryptographic algorithms
2. hardware architectures or implementations of the same cryptographic algorithm
3. hardware platforms from the point of view of their suitability for the implementation of a given algorithm (e.g., choice of an FPGA device or FPGA board)
4. tools and languages in terms of quality of results they generate (e.g., Verilog vs. VHDL, Synplicity Synplify Premier vs. Xilinx XST, ISE v vs. ISE v. 12.3)

George Mason University Modern FPGA Families

ECE 448 – FPGA and ASIC Design with VHDL
Resources
– Xcell Journal, available for FREE on
– Electronic Engineering Journal, available for FREE by after or on the

FPGA Vendors & Families

Major FPGA Vendors
SRAM-based FPGAs
– Xilinx, Inc. (~ 51% of the market)
– Altera Corp. (~ 34% of the market)
– Lattice Semiconductor
– Atmel
– Achronix
– Tabula
Xilinx and Altera together: ~ 85% of the market
Flash & antifuse FPGAs
– Actel Corp. (Microsemi SoC Products Group)
– Quick Logic Corp.

Xilinx FPGA Devices
| Technology | Low-cost | High-performance |
| 220 nm | Spartan II | Virtex |
| 120/150 nm | | Virtex II, II Pro |
| 90 nm | Spartan 3 | Virtex 4 |
| 65 nm | | Virtex 5 |
| 45 nm | Spartan 6 | |
| 40 nm | | Virtex 6 |
| 28 nm | Artix 7 | Virtex 7 |

Altera FPGA Devices
| Technology | Low-cost | Mid-range | High-performance |
| 130 nm | Cyclone | | Stratix |
| 90 nm | Cyclone II | | Stratix II |
| 65 nm | Cyclone III | Arria I | Stratix III |
| 40 nm | Cyclone IV | Arria II | Stratix IV |
| 28 nm | Cyclone V | Arria V | Stratix V |

LUTs & ALUTs

4-bit LUTs vs. 6-bit LUTs
6-bit LUTs introduced in Virtex 5

Major Differences between Xilinx Families
| | Spartan 3, Virtex 4 | Virtex 5, Virtex 6, Spartan 6 |
| Number of CLB slices per CLB | | |
| Number of LUTs per CLB slice | | |
| Look-Up Tables | 4-input | 6-input |

Xilinx Spartan CLB
Figure source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. Copyright © 2004 Mentor Graphics Corp.

Virtex 5 Arrangement of Slices within the CLB

Spartan 3 Multipurpose LUT (MLUT)

Virtex 5 64 x 1 Single Port RAM

Major Differences between Xilinx Families
| | Spartan 3, Virtex 4 | Virtex 5, Virtex 6, Spartan 6 |
| Maximum Single-Port Memory Size per LUT | 16 x 1 | 64 x 1 |
| Maximum Shift Register Size per LUT | 16 bits | 32 bits |
| Number of adder stages per CLB slice | 2 | 4 |
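The single-port memory sizes in this comparison follow from what a look-up table is: a k-input LUT is a 2^k x 1 memory addressed by its inputs, so a 4-input LUT holds 16 bits and a 6-input LUT holds 64 bits. The sketch below is a behavioral model of that idea only. The class `LUT` and the helper `lut_from_function` are hypothetical names, and the model does not attempt to capture shift-register mode or any vendor-specific detail.

```python
class LUT:
    """Behavioral model of a k-input look-up table used as a 2**k x 1 memory."""

    def __init__(self, k: int, contents: int = 0):
        self.k = k
        self.size = 1 << k                               # number of 1-bit entries
        self.contents = contents & ((1 << self.size) - 1)

    def read(self, address: int) -> int:
        """Return the bit stored at `address` (the value on the LUT inputs)."""
        if not 0 <= address < self.size:
            raise IndexError("address out of range")
        return (self.contents >> address) & 1

    def write(self, address: int, bit: int) -> None:
        """Store one bit, as a distributed RAM would."""
        if not 0 <= address < self.size:
            raise IndexError("address out of range")
        self.contents = (self.contents & ~(1 << address)) | ((bit & 1) << address)


def lut_from_function(k: int, func) -> LUT:
    """Fill a LUT with the truth table of a Boolean function of k inputs."""
    contents = 0
    for address in range(1 << k):
        if func(address):
            contents |= 1 << address
    return LUT(k, contents)


if __name__ == "__main__":
    print(LUT(4).size, LUT(6).size)  # memory depth of 4-input and 6-input LUTs
    parity = lut_from_function(4, lambda a: bin(a).count("1") % 2)
    print(parity.read(0b1011))       # odd number of ones on the inputs
```

The same storage can therefore be viewed either as a truth table (logic mode) or as a small RAM (memory mode), which is the duality the comparison above relies on.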

Virtex 5 32-bit Shift Register, SRL

Altera Cyclone III Logic Element (LE) – Normal Mode

High-Level Block Diagram of the Stratix III ALM

Altera Stratix III Adaptive Logic Modules (ALM) – Normal Mode

FPGA Embedded Resources

Embedded Multipliers

Multipliers in Spartan 3

Number of Multipliers per Spartan 3 Device

Combinational and Registered Multiplier

Dedicated Multiplier Block

Cyclone II

Embedded Multiplier Block Overview
Each Cyclone II has one to three columns of embedded multipliers. Each embedded multiplier can be configured to support
– one 18 x 18 multiplier
– two 9 x 9 multipliers

Multiplier Block Architecture

Two Multiplier Types
– two 9x9 multipliers
– one 18x18 multiplier

Multiplier Stage
Signals signa and signb identify whether the multiplier inputs are signed or unsigned.

3 Ways to Use Dedicated Hardware
Three (3) ways to use dedicated (embedded) hardware
– Inference
– Instantiation
– CoreGen in Xilinx, MegaWizard Plug-In Manager in Altera

DSP Units

Xilinx XtremeDSP
– Starting with the Virtex 4 family, Xilinx introduced the DSP48 block for high-speed DSP on FPGAs
– Essentially a multiply-accumulate core with many other features
– Now also in Spartan-3A, Spartan 6, Virtex 5, and Virtex 6

Multiplier-Accumulator - MAC

Mathematical Functions
DSP48 can perform mathematical functions such as:
– Add/Subtract
– Accumulate
– Multiply
– Multiply-Accumulate
– Multiplexer
– Barrel Shifter
– Counter
– Divide (multi-cycle)
– Square Root (multi-cycle)
Can also create filters such as:
– Serial FIR Filter (Xilinx calls these MACC filters)
– Parallel FIR Filter
– Semi-Parallel FIR Filter
– Multi-rate FIR Filters

DSP48 Slice: Virtex 4

Simplified Form of DSP48
Adder Out = (Z ± (X + Y + CIN))

Choosing Inputs to DSP Adder
P = Adder Out = (Z ± (X + Y + CIN))
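The adder equation can be exercised with a small behavioral model. The sketch below assumes a 48-bit two's complement result that wraps on overflow (suggested by the block's name) and drives the multiply-accumulate case by feeding the previous result back on Z and a product on X. The function names, the wrap-around behavior, and the way the product is applied to X are simplifying assumptions, not a description of the actual DSP48 datapath or its multiplexers.

```python
WIDTH = 48
MASK = (1 << WIDTH) - 1


def dsp48_adder(z: int, x: int, y: int, cin: int = 0, subtract: bool = False) -> int:
    """Adder Out = Z +/- (X + Y + CIN), wrapped to an unsigned 48-bit value."""
    inner = x + y + cin
    result = z - inner if subtract else z + inner
    return result & MASK


def to_signed(value: int) -> int:
    """Interpret a 48-bit value as a two's complement signed integer."""
    return value - (1 << WIDTH) if value & (1 << (WIDTH - 1)) else value


def multiply_accumulate(pairs) -> int:
    """Accumulate a*b over all pairs by feeding P back into Z each step."""
    p = 0
    for a, b in pairs:
        p = dsp48_adder(z=p, x=a * b, y=0)
    return to_signed(p)


if __name__ == "__main__":
    print(dsp48_adder(10, 3, 4, cin=1))                 # Z + (X + Y + CIN)
    print(dsp48_adder(10, 3, 4, cin=1, subtract=True))  # Z - (X + Y + CIN)
    print(multiply_accumulate([(1, 2), (3, 4), (-5, 6)]))
```

Selecting what drives Z, X and Y is what turns the same adder into an accumulator, a plain adder/subtracter, or part of a multiply-accumulate loop.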

DSP48E Slice: Virtex 5

New in Virtex 5 Compared to Virtex 4

Xilinx DSP48

Stratix III DSP Unit

Embedded Memories

Memory Types
– Memory: RAM | ROM
– Memory: Single port | Dual port
– Memory: With asynchronous read | With synchronous read

Memory Types in Xilinx
– Memory: Distributed (MLUT-based) | Block RAM-based (BRAM-based)
– Memory: Inferred | Instantiated
  – Instantiated: Manually | Using Core Generator

Memory Types in Altera
– Memory: Distributed (ALUT-based, Stratix III onwards) | Memory block-based
  – Memory block-based: Small size (512) | Medium size (4K, 9K, 20K) | Large size (144K, 512K)
– Memory: Inferred | Instantiated
  – Instantiated: Manually | Using MegaWizard Plug-In Manager

Cyclone II Memory Blocks
The embedded memory structure consists of columns of M4K memory blocks that can be configured as RAM, first-in first-out (FIFO) buffers, and ROM.

Single-Port ROM
– The address lines of the ROM are registered
– The outputs can be registered or unregistered
– A .mif file is used to initialize the ROM contents

Stratix II TriMatrix Memory

Stratix III & Stratix IV TriMatrix Memory

Stratix II & III Shift-Register Memory Configuration

Supply Voltage

Change in Supply Voltages
| Year | Technology (nm) | Core Supply Voltage (V) |

Gigabit Transceivers

Using a Bus to Communicate Between Devices

Using High-Speed Transceivers to Communicate Between Devices

Using High-Speed Transceivers to Communicate Between Devices

Effect of Noise on Single Wire and Differential Pair

Generating a Differential Pair

Multiple Standards for High-Speed Serial Communication
– Fibre Channel
– InfiniBand
– PCI Express (developed by Intel)
– RapidIO
– SkyRail (developed by MindSpeed Technologies)
– 10-gigabit Ethernet

Using FPGA to Interface Between Multiple Standards

Ganging Multiple Transceivers Together

An Ideal Signal vs. Signal Seen by Receiver

The Effects of Transmitting a Series of Identical Bits

Main Elements of the Transceiver Block

Recovering Clock Signal

Sampling the Incoming Signal

Xilinx ML605 Evaluation Kit: $1795

PLDA XpressV6 Design Kit: $3990

HiTech Global PCI Express Gen 2 / SFP+ / USB 3.0 Development Board: $2995

HiTech Global HXT 8-lane PCI Express/4-port SFP+ Optical Network Card: $8995

HiTech Global HXT 16-lane PCI Express Optical Network Card: $8995

Altera Stratix IV GX FPGA Development Kit: $4495

PLDA XpressGX4LP Design Kit: $4990

HiTech Global GT/GX PCI Express Gen 2 / 3 & Optical Development Platform/Networking Card: $5995

Terasic DE4 Development and Education Board: $2995

Gutz Logic PCI Express x1 Demo Board (Actel FPGA)

LatticeSC PCI Express x4 Evaluation

Board Overview
| Manufacturer | Name | FPGA | Memory | Application | PCIe | Throughput | Base Price |
Boards based on Xilinx Virtex-6
| Xilinx | ML605 Evaluation Kit | LX240T-1 | 2GB (max) | General Purpose | 1.1 x8 / 2.0 x4 | 2 GB/s | $1795 |
| PLDA | XpressV6 Design Kit | LX550T (max) | 8GB (max) | General Purpose | 2.0 x8 | 4 GB/s | $3990 |
| HiTech Global | PCI Express / USB 3.0 | LX550T (max) | 8GB (max) | General Purpose | 2.0 x8 | 4 GB/s | $2995 |
| HiTech Global | HXT 8-lane Optical | HX565T (max) | 16GB (max) | High Speed Eth. | 2.0 x8 | 4 GB/s | $8995 |
| HiTech Global | HXT 16-lane Optical | HX565T (max) | 16GB (max) | High Speed Eth. | 2.0 x16 | 8 GB/s | $8995 |
Boards based on Altera Stratix IV
| Altera | Stratix IV GX Kit | GX530 (max) | 512MB | General Purpose | 2.0 x8 | 4 GB/s | $4495 |
| PLDA | XpressGX4LP | GX530 (max) | 2GB (max) | High Speed Eth. | 2.0 x8 | 4 GB/s | $4990 |
| HiTech Global | GT/GX PCIe & Optical | 100G5 (max) | 4GB (max) | High Speed Eth. | 3.0 x8 | 8 GB/s | $5995 |
| Terasic | DE4 Board | GX530 (max) | 8GB (max) | General Purpose | 2.0 x8 | 4 GB/s | $2995 |
Boards based on Actel ProASIC3
| Gutz Logic | PCIe x1 Demo Board | A3P1000 | 1MB | PCIe Evaluation | 1.1 x1 | 250 MB/s | N/A |
Boards based on Lattice Semiconductor LatticeSC
| Lattice | LatticeSC PCIe x4 Board | ECP2M-50 | 32MB | PCIe Evaluation | 1.1 x4 | 1 GB/s | N/A |
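The Throughput column is consistent with multiplying a nominal per-lane rate by the number of lanes. The sketch below assumes per-direction rates of 250 MB/s per lane for PCIe 1.1, 500 MB/s for PCIe 2.0, and roughly 1000 MB/s for PCIe 3.0 (rounded up from about 985 MB/s). The lookup table and the function name `pcie_throughput_mb_s` are illustrative; real sustained throughput is lower because of protocol overhead.

```python
# Nominal usable bandwidth per lane, per direction, in MB/s.
# The 3.0 figure is rounded, matching the rounding used in the board table.
PER_LANE_MB_S = {"1.1": 250, "2.0": 500, "3.0": 1000}


def pcie_throughput_mb_s(generation: str, lanes: int) -> int:
    """Nominal one-direction link throughput in MB/s for a given PCIe link."""
    if generation not in PER_LANE_MB_S:
        raise ValueError(f"unknown PCIe generation: {generation!r}")
    if lanes not in (1, 2, 4, 8, 16, 32):
        raise ValueError(f"unsupported lane count: {lanes}")
    return PER_LANE_MB_S[generation] * lanes


if __name__ == "__main__":
    for generation, lanes in [("1.1", 1), ("1.1", 8), ("2.0", 4), ("2.0", 8), ("3.0", 8)]:
        print(f"PCIe {generation} x{lanes}: {pcie_throughput_mb_s(generation, lanes)} MB/s")
```

This also shows why the ML605 entry lists two link options with the same throughput: eight lanes at the 1.1 rate and four lanes at the 2.0 rate give the same nominal figure.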

Embedded Microprocessors

Embedded Microprocessor Cores

Virtex-II Pro Architecture Features:
1. Processor Block
2. RocketIO Multi-Gigabit Transceivers
3. CLB and Configurable Logic
4. SelectIO-Ultra
5. Digital Clock Managers
6. Multipliers and Block SelectRAM

PowerPC Cores
PowerPC System

[Figure: Embedded Development Kit (EDK) flow. Hardware flow: Microprocessor Hardware Specification File; Processor IP, Microprocessor Peripheral Description Files; PlatGen; EDIF, IP Netlists; VHDL / Verilog; Synthesizer; System Constraint File; ISE / Xflow; Bitstream. Software flow: Microprocessor Software Specification File; LibGen; Libraries; C / C++ Code; Compiler; Object Files; Linker; Executable. Both flows meet in Data2MEM, followed by Download to FPGA.]

Zynq - Extensible Processing Platform

Zynq – 7000 EPP

Zynq – 7000 Product Table

George Mason University Follow-up Courses

ECE Department MS in Electrical Engineering MS EE MS in Computer Engineering MS CpE COMMUNICATIONS & NETWORKING SIGNAL PROCESSING CONTROL & ROBOTICS MICROELECTRONICS/ NANOELECTRONICS SYSTEM DESIGN DIGITAL SYSTEMS DESIGN COMPUTER NETWORKS MICROPROCESSORS & EMBEDDED SYSTEMS NETWORK & SYSTEM SECURITY Programs Specializations BIOENGINEERING

DIGITAL SYSTEMS DESIGN 1.ECE 545 Digital System Design with VHDL (Fall) – K. Gaj, project, FPGA design with VHDL, Aldec/Synplicity/Xilinx/Altera 2. ECE 645 Computer Arithmetic (Spring) – K. Gaj, project, FPGA design with VHDL or Verilog, Aldec/Synplicity/Xilinx/Altera 3. ECE 586 Digital Integrated Circuits (Spring) – D. Ioannou 4. ECE 681 VLSI Design for ASICs (Fall) – H. Homayoun, project/lab, front-end and back-end ASIC design with Synopsys tools 5. ECE 682 VLSI Test Concepts (Spring) – T. Storey, homework 6. ECE 699 Digital Signal Processing Hardware Architectures (Fall) – A. Cohen, project, FPGA design with VHDL or Verilog

Possible New Graduate Computer Engineering Courses
5xx Digital System Design with Verilog
6xx Reconfigurable Computing
(looking for instructors)

NETWORK AND SYSTEM SECURITY
1. ECE 542 Computer Network Architectures and Protocols (Fall, Spring) – S.-C. Chang, et al.
2. ECE 646 Cryptography and Computer Network Security (Fall) – K. Gaj, J-P. Kaps – lab, project: software/hardware/analytical
3. ECE 746 Advanced Applied Cryptography (every 2nd Spring, 2013) – K. Gaj, J-P. Kaps – lab, project: software/hardware/analytical
4. ECE 699 Cryptographic Engineering (every 2nd Spring, 2014) – J-P. Kaps – lectures + student/invited guest seminars
5. ISA 656 Network Security (Fall, Spring) – A. Stavrou

ECE 645 Computer Arithmetic Instructor: Dr. Kris Gaj

Advanced digital circuit design course covering
– addition and subtraction
– multiplication
– division and modular reduction
– exponentiation
Efficient architectures for
– Integers: unsigned and signed
– Real numbers: fixed point; single and double precision floating point
– Elements of the Galois field GF(2^n): polynomial basis

Course Objectives
At the end of this course you should be able to:
– Understand mathematical and gate-level algorithms for computer addition, subtraction, multiplication, division, and exponentiation
– Understand tradeoffs involved with different arithmetic architectures between performance, area, latency, scalability, etc.
– Synthesize and implement computer arithmetic blocks on FPGAs
– Be comfortable with different number systems, and have familiarity with floating-point and Galois field arithmetic for future study
– Understand sources of error in computer arithmetic and basics of error analysis
This knowledge will come about through homework, projects and practice exams.

Lecture topics
INTRODUCTION
1. Applications of computer arithmetic algorithms. Initial discussion of project topics.

ADDITION AND SUBTRACTION
1. Basic addition, subtraction, and counting
2. Addition in Xilinx and Altera FPGAs
3. Carry-lookahead, carry-select, and hybrid adders
4. Adders based on Parallel Prefix Networks
5. Pipelined Adders
6. Modular addition and subtraction

MULTIOPERAND ADDITION
1. Carry-save adders
2. Wallace and Dadda Trees
3. Adding multiple unsigned and signed numbers
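As a small illustration of the first topic above, the following is a minimal behavioral sketch of carry-save addition in Python, not course material. The function names `carry_save_add` and `multioperand_add` are hypothetical, and the operands are assumed to be non-negative integers. Each stage reduces three operands to a sum word and a carry word without propagating carries; only the last step uses an ordinary carry-propagate addition.

```python
from typing import List, Tuple


def carry_save_add(x: int, y: int, z: int) -> Tuple[int, int]:
    """Reduce three non-negative integers to a (sum, carry) pair.

    No carry is propagated between bit positions: every column is
    handled independently, as in a row of full adders.
    """
    partial_sum = x ^ y ^ z                      # column-wise sum bit
    carry = ((x & y) | (x & z) | (y & z)) << 1   # column-wise carry bit, weight 2
    return partial_sum, carry


def multioperand_add(operands: List[int]) -> int:
    """Add many non-negative integers with a chain of carry-save stages."""
    partial_sum, carry = 0, 0
    for operand in operands:
        partial_sum, carry = carry_save_add(partial_sum, carry, operand)
    # One conventional (carry-propagate) addition at the very end.
    return partial_sum + carry
```

A linear chain is used here only for brevity; Wallace and Dadda trees arrange the same 3-to-2 reduction in a tree to shorten the critical path.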

NUMBER REPRESENTATIONS
Unsigned Integers
Signed Integers
Fixed-point real numbers
Floating-point real numbers
Elements of the Galois Field GF(2^n)

LONG INTEGER ARITHMETIC
1. Modular Exponentiation
2. Montgomery Multipliers and Exponentiation Units
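To make these two topics concrete, here is a minimal software sketch of Montgomery multiplication and a left-to-right modular exponentiation built on it. It is an illustration under stated assumptions, not the architecture taught in the course: the names `montgomery_multiply` and `montgomery_pow` are hypothetical, the modulus `n` is assumed odd with `1 < n < 2**k`, and the sketch relies on Python 3.8+ for `pow(n, -1, r)`.

```python
def montgomery_multiply(a: int, b: int, n: int, k: int) -> int:
    """Return a * b * 2^(-k) mod n, for odd n < 2^k and 0 <= a, b < n."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r            # n * n_prime = -1 (mod r)
    t = a * b
    m = ((t & (r - 1)) * n_prime) & (r - 1)   # makes t + m*n divisible by r
    u = (t + m * n) >> k                      # exact division by r
    return u - n if u >= n else u             # single conditional subtraction


def montgomery_pow(base: int, exponent: int, n: int, k: int) -> int:
    """Return base^exponent mod n using Montgomery multiplications."""
    r = 1 << k
    x = (base * r) % n        # operand converted to the Montgomery domain
    acc = r % n               # the value 1 in the Montgomery domain
    for bit in bin(exponent)[2:]:             # scan exponent bits, MSB first
        acc = montgomery_multiply(acc, acc, n, k)
        if bit == "1":
            acc = montgomery_multiply(acc, x, n, k)
    return montgomery_multiply(acc, 1, n, k)  # convert back to a normal residue
```

The reduction step replaces division by `n` with masking and shifting by `k` bits, which is the property that makes this method attractive for hardware multipliers.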

MULTIPLICATION
1. Tree and array multipliers
2. Sequential multipliers
3. Multiplication of signed numbers and squaring
4. Multiplication in Xilinx and Altera FPGAs
- using distributed logic
- using embedded multipliers
- using DSP blocks
5. Multiple clock systems

DIVISION
1. Basic restoring and non-restoring sequential dividers
2. SRT and high-radix dividers
3. Array dividers
4. Division by Convergence
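The first item above can be sketched behaviorally as follows. This is a minimal Python model of a restoring sequential divider for unsigned operands, assuming the dividend fits in `width` bits; the function name `restoring_divide` and its parameters are hypothetical. One quotient bit is produced per iteration, mirroring one clock cycle of a sequential divider.

```python
from typing import Tuple


def restoring_divide(dividend: int, divisor: int, width: int) -> Tuple[int, int]:
    """Unsigned restoring division; returns (quotient, remainder).

    Assumes 0 <= dividend < 2**width and divisor > 0.
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    remainder = 0
    quotient = 0
    for i in range(width - 1, -1, -1):
        # Shift the partial remainder left and bring in the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        trial = remainder - divisor          # trial subtraction
        if trial >= 0:
            remainder = trial                # keep the difference, quotient bit = 1
            quotient |= 1 << i
        # Otherwise the old partial remainder is kept ("restored"), quotient bit = 0.
    return quotient, remainder
```

For example, `restoring_divide(100, 7, 8)` returns `(14, 2)`, since 100 = 7 * 14 + 2.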

FLOATING POINT AND GALOIS FIELD ARITHMETIC
1. Floating-point units
2. Galois Field GF(2^n) units
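As an illustration of the second item, the sketch below multiplies two elements of GF(2^n) in polynomial basis using the shift-and-add method with interleaved reduction. It is a minimal behavioral model, not a course design: the name `gf2n_multiply` is hypothetical, field elements are assumed to be encoded as integers whose bit i is the coefficient of x^i, and `modulus` is assumed to be an irreducible polynomial of degree n including its x^n term.

```python
def gf2n_multiply(a: int, b: int, modulus: int, n: int) -> int:
    """Multiply a and b in GF(2^n), polynomial basis.

    a and b must be smaller than 2**n; modulus encodes the irreducible
    polynomial with its leading x^n term (for example 0x11B for n = 8).
    """
    result = 0
    for _ in range(n):
        if b & 1:
            result ^= a          # addition in GF(2^n) is bitwise XOR
        b >>= 1
        a <<= 1                  # multiply a by x
        if (a >> n) & 1:
            a ^= modulus         # reduce as soon as degree n is reached
    return result
```

With the AES polynomial x^8 + x^4 + x^3 + x + 1 (`0x11B`), `gf2n_multiply(0x57, 0x83, 0x11B, 8)` returns `0xC1`, matching the worked example in FIPS 197.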

ECE 682 VLSI Test Concepts Instructor: Dr. Tom Storey

Course Description
Broad introduction to basic concepts, techniques, and tools of modern VLSI testing. Fundamentals of defect modeling, fault simulation, design for testability, built-in self-test techniques, and failure analysis. Topics include test economics, physical defects and fault modeling, automated test pattern generation, fault simulation, design for test, built-in self-test, memory test, PLD test, mixed-signal test, Iddq test, boundary scan and related standards, test synthesis, diagnosis and failure analysis, automated test equipment, and embedded core test.

Course Text

Course Topics
Introduction to Test Methods, Test Equipment, and the Economics of Test
Fault and Defect Modeling
Logic Test Generation
Fault Simulation
Memory Test
Design for Testability
Advanced Testing Methods
Future of VLSI Test

Course Changes
New Text
– Updated to reflect advances in state of the art
– Covers a broader range of test topics
– More engaging text, figures
Course Content
– Redone to reflect textbook change
– Added developments since text was written
– More emphasis on industry examples/war stories