George Mason University ATHENa - Automated Tool for Hardware EvaluatioN Modern FPGA Families ECE 545 Lecture 12.


1 George Mason University ATHENa - Automated Tool for Hardware EvaluatioN Modern FPGA Families ECE 545 Lecture 12

2 George Mason University ATHENa

3 Resources ATHENa website http://cryptography.gmu.edu/athena

4 ATHENa – Automated Tool for Hardware EvaluatioN Supported in part by the National Institute of Standards & Technology (NIST)

5 ATHENa Team
– Venkata “Vinny”, MS CpE student
– Ekawat “Ice”, PhD CpE student
– Marcin, PhD ECE student
– Rajesh, PhD ECE student
– Michal, PhD exchange student from Slovakia
– John, MS CpE student

6 ATHENa – Automated Tool for Hardware EvaluatioN
Open-source benchmarking tool, written in Perl, aimed at AUTOMATED generation of OPTIMIZED results for MULTIPLE hardware platforms. Currently under development at George Mason University. http://cryptography.gmu.edu/athena

7 Why Athena? “The Greek goddess Athena was frequently called upon to settle disputes between the gods or various mortals. Athena Goddess of Wisdom was known for her superb logic and intellect. Her decisions were usually well-considered, highly ethical, and seldom motivated by self-interest.” From “Athena, Greek Goddess of Wisdom and Craftsmanship”

8 Basic Dataflow of ATHENa [diagram: Designer (HDL + FPGA Tools), ATHENa Server, User. Labeled flows: Interfaces + Testbenches; Download scripts and configuration files; HDL + scripts + configuration files; FPGA Synthesis and Implementation; Result Summary + Database Entries; Database Entries; Database query; Ranking of designs]

9 [diagram: synthesizable source files, configuration files, testbench, constraint files; result summary (user-friendly); database entries (machine-friendly)]

10 ATHENa Major Features (1)
– synthesis, implementation, and timing analysis in batch mode
– support for devices and tools of multiple FPGA vendors
– generation of results for multiple families of FPGAs of a given vendor
– automated choice of a best-matching device within a given family

11 ATHENa Major Features (2)
– automated verification of designs through simulation in batch mode
– support for multi-core processing
– automated extraction and tabulation of results
– several optimization strategies aimed at finding
  – optimum options of tools
  – best target clock frequency
  – best starting point of placement

12 Generation of Results Facilitated by ATHENa
– batch mode of FPGA tools
– ease of extraction and tabulation of results: Text Reports, Excel, CSV (Comma-Separated Values)
– optimized choice of tool options: GMU_optimization_1 strategy

13 Relative Improvement of Results from Using ATHENa
Virtex 5, 256-bit Variants of Hash Functions. Ratios of results obtained using ATHENa suggested options vs. default options of FPGA tools

14 Other (Somewhat) Similar Tools
– ExploreAhead (part of PlanAhead)
– Design Space Explorer (DSE)
– Boldport Flow
– EDAx10 Cloud Platform

15 Distinguishing Features of ATHENa
– Support for multiple tools from multiple vendors
– Optimization strategies aimed at the best possible performance rather than design closure
– Extraction and presentation of results
– Seamless integration with the ATHENa database of results

16 How To Start Working With ATHENa? One-Time Tasks
– Download and unzip ATHENa http://cryptography.gmu.edu/athena/
– Read the Tutorial!
– Install the Required Tools (see Tutorial - Part 1 – Tools Installation)
– Run ATHENa_setup

17 How To Start Working With ATHENa? Repetitive Tasks
– Prepare or modify your source files & source_list.txt
– Modify design.config.txt + possibly other configuration files
– Run ATHENa

18 design.config.txt Your Design
# directory containing synthesizable source files for the project
SOURCE_DIR =
# A file list containing list of files in the order suitable for synthesis and implementation
# low level modules first, top level entity last
SOURCE_LIST_FILE = source_list.txt
# project name
# it will be used in the names of result directories
PROJECT_NAME = SHA256
# name of top level entity
TOP_LEVEL_ENTITY = sha256
# name of top level architecture
TOP_LEVEL_ARCH = rs_arch
# name of clock net
CLOCK_NET = clk
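The KEY = VALUE layout of design.config.txt is simple enough to read with a few lines of code. The sketch below is only an illustration in Python (ATHENa itself is written in Perl, and its real parser is not shown in these slides); the function name `parse_config` is hypothetical.

```python
def parse_config(text):
    """Parse KEY = VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "=" not in line:
            continue  # e.g. block terminators such as END FAMILY
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# A fragment of the slide's design.config.txt used as sample input.
example = """
# project name
PROJECT_NAME = SHA256
# name of top level entity
TOP_LEVEL_ENTITY = sha256
TOP_LEVEL_ARCH = rs_arch
CLOCK_NET = clk
"""

config = parse_config(example)
print(config["PROJECT_NAME"], config["CLOCK_NET"])
```

A flat dictionary is enough for this slide; the vendor and family blocks shown on later slides are nested, so a fuller reader would have to track FPGA_VENDOR / END VENDOR scopes.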

19 design.config.txt Timing Formulas
#formula for latency
LATENCY = TCLK*65
#formula for throughput
THROUGHPUT = 512/(TCLK*65)
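The two formulas can be checked by hand. The sketch below evaluates them in Python, assuming TCLK is given in ns (so latency comes out in ns, and bits per ns equal Gbit/s); these units match the report_timing.txt legend, which lists latency in ns and throughput in Mbits/s. The function names and the 10 ns clock period are illustrative only.

```python
def latency_ns(tclk_ns, cycles=65):
    """LATENCY = TCLK*65: time to process one block, in ns."""
    return tclk_ns * cycles


def throughput_mbps(tclk_ns, block_bits=512, cycles=65):
    """THROUGHPUT = 512/(TCLK*65), converted from bits/ns (Gbit/s) to Mbit/s."""
    return block_bits / (tclk_ns * cycles) * 1000.0


tclk = 10.0  # assumed clock period in ns (100 MHz)
print(latency_ns(tclk))                 # 650.0
print(round(throughput_mbps(tclk), 2))  # 787.69
```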

20 design.config.txt Application & Optimization Target
# OPTIMIZATION_TARGET = speed | area | balanced
OPTIMIZATION_TARGET = speed
# OPTIONS = default | user
OPTIONS = default
# APPLICATION = single_run | exhaustive_search | placement_search | frequency_search |
#               GMU_Optimization_1 | GMU_Xilinx_optimization_1
APPLICATION = single_run
# TRIM_MODE = off | zip | delete
TRIM_MODE = zip

21 design.config.txt FPGA Families
# commenting the next line removes all families of Xilinx
FPGA_VENDOR = xilinx
#commenting the next line removes a given family
FPGA_FAMILY = spartan3
# FPGA_DEVICES = | best_match | all
FPGA_DEVICES = best_match
SYN_CONSTRAINT_FILE = default
IMP_CONSTRAINT_FILE = default
REQ_SYN_FREQ = 120
REQ_IMP_FREQ = 100
MAX_SLICE_UTILIZATION = 0.8
MAX_BRAM_UTILIZATION = 0.8
MAX_MUL_UTILIZATION = 1
MAX_PIN_UTILIZATION = 0.9
END FAMILY
END VENDOR

22 design.config.txt FPGA Families
# commenting the next line removes all families of Altera
FPGA_VENDOR = altera
#commenting the next line removes a given family
FPGA_FAMILY = Stratix III
# FPGA_DEVICES = | best_match | all
FPGA_DEVICES = best_match
SYN_CONSTRAINT_FILE = default
IMP_CONSTRAINT_FILE = default
REQ_IMP_FREQ = 120
MAX_LOGIC_UTILIZATION = 0.8
MAX_MEMORY_UTILIZATION = 0.8
MAX_DSP_UTILIZATION = 0
MAX_MUL_UTILIZATION = 0
MAX_PIN_UTILIZATION = 0.8
END FAMILY
END VENDOR

23 Library Files
device_lib/xilinx_device_lib.txt
device_lib/altera_device_lib.txt
Files created during ATHENa setup. They characterize the FPGA families and devices available in the version of Xilinx and Altera tools installed on your computer.
Currently supported tool versions:
– Xilinx WebPACK 9.1, 9.2, 10.1, 11.1, 11.5, 12.1, 12.2, 12.3
– Xilinx Design Suite 11.1, 12.1, 12.2, 12.3
– Altera Quartus II Web Edition 8.1, 8.2, 9.0, 9.1, 10.0
– Altera Quartus II Subscription Edition 9.1, 10.0
If a library for a given version is not available yet, use a library from the closest available version.

24 Library Files device_lib/xilinx_device_lib.txt
VENDOR = Xilinx
#Device, Total Slices, Block RAMs, DSP, Dedicated Multipliers, Maximum User I/O Pins
ITEM_ORDER = SLICE, BRAM, DSP, MULT, IO
FAMILY = spartan3
xc3s50pq208-5, 768, 4, 0, 4, 124
xc3s200ft256-5, 1920, 12, 0, 12, 173
xc3s400fg456-5, 3584, 16, 0, 16, 264
xc3s1000fg676-5, 7680, 24, 0, 24, 391
xc3s1500fg676-5, 13312, 32, 0, 32, 487
END_FAMILY
FAMILY = virtex5
xc5vlx30ff676-3, 4800, 32, 32, 0, 400
xc5vfx30tff665-3, 5120, 68, 64, 0, 360
xc5vlx30tff665-3, 4800, 36, 32, 0, 360
xc5vlx50ff1153-3, 7200, 48, 48, 0, 560
xc5vlx50tff1136-3, 7200, 60, 48, 0, 480
END_FAMILY
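One plausible reading of FPGA_DEVICES = best_match is "the smallest device in the library whose resources, scaled by the MAX_*_UTILIZATION limits, cover the design". The slides do not spell out ATHENa's actual selection rule, so the Python sketch below is only an assumption-laden illustration: the function `best_match` is hypothetical, the DSP limit of 1.0 is assumed (the Xilinx configuration slide gives none), and the resource needs are taken from the spartan3 row of report_resource_utilization.txt.

```python
# Entries copied from the spartan3 section of xilinx_device_lib.txt:
# (device, slices, BRAMs, DSPs, multipliers, I/O pins), smallest device first.
SPARTAN3 = [
    ("xc3s50pq208-5", 768, 4, 0, 4, 124),
    ("xc3s200ft256-5", 1920, 12, 0, 12, 173),
    ("xc3s400fg456-5", 3584, 16, 0, 16, 264),
    ("xc3s1000fg676-5", 7680, 24, 0, 24, 391),
    ("xc3s1500fg676-5", 13312, 32, 0, 32, 487),
]


def best_match(devices, need, limits):
    """Return the first device whose capacity times the limit covers every need."""
    for name, *capacity in devices:
        if all(n <= cap * lim for n, cap, lim in zip(need, capacity, limits)):
            return name
    return None  # no device in the family is large enough


# Order follows ITEM_ORDER = SLICE, BRAM, DSP, MULT, IO.
need = (74, 4, 0, 7, 20)              # from the spartan3 report row
limits = (0.8, 0.8, 1.0, 1.0, 0.9)    # slide values; the DSP limit is assumed
print(best_match(SPARTAN3, need, limits))  # xc3s200ft256-5
```

Under these assumptions the xc3s50 is rejected because 4 block RAMs exceed 80% of its 4, and the next device, xc3s200ft256-5, is chosen, which is also the device that appears in the report.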

25 Result Files report_resource_utilization.txt

xilinx : spartan3
+---------+-----------------+-----+------+---+--------+---+-------+----+-------+----+------+---+----+----+
| GENERIC | DEVICE          | RUN | LUTs | % | SLICES | % | BRAMs | %  | MULTs | %  | DSPs | % | IO | %  |
+---------+-----------------+-----+------+---+--------+---+-------+----+-------+----+------+---+----+----+
| default | xc3s200ft256-5* | 1   | 142  | 3 | 74     | 3 | 4     | 33 | 7     | 58 | 0    | 0 | 20 | 11 |
+---------+-----------------+-----+------+---+--------+---+-------+----+-------+----+------+---+----+----+

xilinx : spartan6
+---------+------------------+-----+------+---+--------+---+-------+---+-------+---+------+----+----+----+
| GENERIC | DEVICE           | RUN | LUTs | % | SLICES | % | BRAMs | % | MULTs | % | DSPs | %  | IO | %  |
+---------+------------------+-----+------+---+--------+---+-------+---+-------+---+------+----+----+----+
| default | xc6slx9csg324-3* | 1   | 41   | 1 | 22     | 1 | 4     | 6 | 0     | 0 | 9    | 56 | 20 | 10 |
+---------+------------------+-----+------+---+--------+---+-------+---+-------+---+------+----+----+----+

xilinx : virtex5
+---------+-------------------+-----+------+---+--------+---+-------+----+-------+---+------+----+----+----+
| GENERIC | DEVICE            | RUN | LUTs | % | SLICES | % | BRAMs | %  | MULTs | % | DSPs | %  | IO | %  |
+---------+-------------------+-----+------+---+--------+---+-------+----+-------+---+------+----+----+----+
| default | xc5vlx20tff323-2* | 1   | 101  | 1 | 56     | 1 | 4     | 15 | 0     | 0 | 9    | 37 | 20 | 11 |
+---------+-------------------+-----+------+---+--------+---+-------+----+-------+---+------+----+----+----+

xilinx : virtex6
+---------+-------------------+-----+------+---+--------+---+-------+---+-------+---+------+---+----+---+
| GENERIC | DEVICE            | RUN | LUTs | % | SLICES | % | BRAMs | % | MULTs | % | DSPs | % | IO | % |
+---------+-------------------+-----+------+---+--------+---+-------+---+-------+---+------+---+----+---+
| default | xc6vlx75tff784-3* | 1   | 44   | 1 | 21     | 1 | 4     | 1 | 0     | 0 | 9    | 3 | 20 | 5 |
+---------+-------------------+-----+------+---+--------+---+-------+---+-------+---+------+---+----+---+

26 Result Files report_timing.txt

REQ SYN FREQ – Requested synthesis clk. freq.
SYN FREQ – Achieved synthesis clk. freq.
REQ SYN TCLK – Requested synthesis clk. period
SYN TCLK – Achieved synthesis clk. period
REQ IMP FREQ – Requested implementation clk. freq.
IMP FREQ – Achieved implementation clk. freq.
REQ IMP TCLK – Requested implementation clk. period
IMP TCLK – Achieved implementation clk. period
LATENCY – Latency [ns]
THROUGHPUT – Throughput [Mbits/s]
TP/Area – Throughput/Area [(Mbits/s)/CLB slices]
Latency*Area – Latency*Area [ns*CLB slices]

xilinx : spartan3
+---------+-----------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| GENERIC | DEVICE          | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area    | Latency*Area |
+---------+-----------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| default | xc3s200ft256-5* | 1   | default      | 207.370  | default      | 4.822    | default      | 112.448  | default      | 8.893    | 17.786  | 449.792    | 6.078      | 1316.164     |
+---------+-----------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+

xilinx : spartan6
+---------+------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| GENERIC | DEVICE           | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area    | Latency*Area |
+---------+------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| default | xc6slx9csg324-3* | 1   | default      | 75.751   | default      | 13.201   | default      | 78.119   | default      | 12.801   | 25.602  | 312.476    | 14.203     | 563.244      |
+---------+------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+

xilinx : virtex5
+---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| GENERIC | DEVICE            | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area    | Latency*Area |
+---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| default | xc5vlx20tff323-2* | 1   | default      | 156.347  | default      | 6.396    | default      | 126.952  | default      | 7.877    | 15.754  | 507.808    | 9.068      | 882.224      |
+---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+

xilinx : virtex6
+---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| GENERIC | DEVICE            | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area    | Latency*Area |
+---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| default | xc6vlx75tff784-3* | 1   | default      | 158.053  | default      | 6.327    | default      | 135.410  | default      | 7.385    | 14.770  | 541.638    | 25.792     | 310.170      |
+---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
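The derived columns of the timing report follow from the primary ones: frequency in MHz is 1000 divided by the clock period in ns, TP/Area is throughput divided by the number of CLB slices, and Latency*Area is latency times the number of slices. The Python sketch below recomputes them for the spartan3 row, taking the slice count (74) from report_resource_utilization.txt. The function name `derived_metrics` is illustrative, and the sketch is a consistency check rather than ATHENa's own code.

```python
def derived_metrics(imp_tclk_ns, latency_ns, throughput_mbps, slices):
    """Recompute the derived report columns from the primary ones."""
    return {
        "imp_freq_mhz": round(1000.0 / imp_tclk_ns, 3),
        "tp_per_area": round(throughput_mbps / slices, 3),
        "latency_area": round(latency_ns * slices, 3),
    }


# spartan3 row: IMP TCLK 8.893 ns, latency 17.786 ns, 449.792 Mbits/s, 74 slices
row = derived_metrics(8.893, 17.786, 449.792, 74)
print(row)
```

The values agree with the IMP FREQ (112.448), TP/Area (6.078), and Latency*Area (1316.164) columns of the report.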

27 Result Files report_options.txt

xilinx : spartan3
+---------+-----------------+-----+------------+------------------------------+-------------------------+--------------+
| GENERIC | DEVICE          | RUN | COST TABLE | Synthesis Options            | Map Options             | PAR Options  |
+---------+-----------------+-----+------------+------------------------------+-------------------------+--------------+
| default | xc3s200ft256-5* | 1   | 1          | -opt_level 1 -opt_mode speed | -c 100 -pr b -cm speed  | -w -ol std   |
+---------+-----------------+-----+------------+------------------------------+-------------------------+--------------+

xilinx : spartan6
+---------+------------------+-----+------------+------------------------------+---------------+--------------+
| GENERIC | DEVICE           | RUN | COST TABLE | Synthesis Options            | Map Options   | PAR Options  |
+---------+------------------+-----+------------+------------------------------+---------------+--------------+
| default | xc6slx9csg324-3* | 1   | 1          | -opt_level 1 -opt_mode speed | -c 100 -pr b  | -w -ol std   |
+---------+------------------+-----+------------+------------------------------+---------------+--------------+

xilinx : virtex5
+---------+-------------------+-----+------------+------------------------------+-------------------------+--------------+
| GENERIC | DEVICE            | RUN | COST TABLE | Synthesis Options            | Map Options             | PAR Options  |
+---------+-------------------+-----+------------+------------------------------+-------------------------+--------------+
| default | xc5vlx20tff323-2* | 1   | 1          | -opt_level 1 -opt_mode speed | -c 100 -pr b -cm speed  | -w -ol std   |
+---------+-------------------+-----+------------+------------------------------+-------------------------+--------------+

xilinx : virtex6
+---------+-------------------+-----+------------+------------------------------+---------------+--------------+
| GENERIC | DEVICE            | RUN | COST TABLE | Synthesis Options            | Map Options   | PAR Options  |
+---------+-------------------+-----+------------+------------------------------+---------------+--------------+
| default | xc6vlx75tff784-3* | 1   | 1          | -opt_level 1 -opt_mode speed | -c 100 -pr b  | -w -ol std   |
+---------+-------------------+-----+------------+------------------------------+---------------+--------------+

COST TABLE – Parameter determining the starting point of placement
Synthesis Options – Options of the synthesis tool
Map Options – Options of the mapping tool
PAR Options – Options of the place & route tool

28 Result Files report_execution_time.txt

xilinx : spartan3
+---------+-----------------+-----+----------------+---------------------+--------------+
| GENERIC | DEVICE          | RUN | Synthesis Time | Implementation Time | Elapsed Time |
+---------+-----------------+-----+----------------+---------------------+--------------+
| default | xc3s200ft256-5* | 1   | 0d 0h:0m:12s   | 0d 0h:0m:36s        | 0d 0h:0m:48s |
+---------+-----------------+-----+----------------+---------------------+--------------+

xilinx : spartan6
+---------+------------------+-----+----------------+---------------------+--------------+
| GENERIC | DEVICE           | RUN | Synthesis Time | Implementation Time | Elapsed Time |
+---------+------------------+-----+----------------+---------------------+--------------+
| default | xc6slx9csg324-3* | 1   | 0d 0h:0m:21s   | 0d 0h:1m:13s        | 0d 0h:1m:34s |
+---------+------------------+-----+----------------+---------------------+--------------+

xilinx : virtex5
+---------+-------------------+-----+----------------+---------------------+--------------+
| GENERIC | DEVICE            | RUN | Synthesis Time | Implementation Time | Elapsed Time |
+---------+-------------------+-----+----------------+---------------------+--------------+
| default | xc5vlx20tff323-2* | 1   | 0d 0h:0m:39s   | 0d 0h:1m:50s        | 0d 0h:2m:29s |
+---------+-------------------+-----+----------------+---------------------+--------------+

xilinx : virtex6
+---------+-------------------+-----+----------------+---------------------+--------------+
| GENERIC | DEVICE            | RUN | Synthesis Time | Implementation Time | Elapsed Time |
+---------+-------------------+-----+----------------+---------------------+--------------+
| default | xc6vlx75tff784-3* | 1   | 0d 0h:0m:22s   | 0d 0h:3m:22s        | 0d 0h:3m:44s |
+---------+-------------------+-----+----------------+---------------------+--------------+

Synthesis Time – Time of synthesis
Implementation Time – Time of implementation
Elapsed Time – Total time
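When post-processing these reports, the "0d 0h:0m:12s" time strings have to be converted to numbers first. A minimal Python sketch follows, assuming the format is always days, hours, minutes, and seconds exactly as printed above; the helper name `to_seconds` is hypothetical.

```python
import re

TIME_PATTERN = re.compile(r"(\d+)d (\d+)h:(\d+)m:(\d+)s")


def to_seconds(text):
    """Convert a report time such as '0d 0h:1m:13s' to seconds."""
    match = TIME_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    days, hours, minutes, seconds = map(int, match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# virtex6 row: synthesis plus implementation should equal the elapsed time
total = to_seconds("0d 0h:0m:22s") + to_seconds("0d 0h:3m:22s")
print(total, to_seconds("0d 0h:3m:44s"))  # 224 224
```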

29 design.config.txt Functional Simulation (1)
# FUNCTIONAL_VERIFICATION_MODE =
FUNCTIONAL_VERIFICATION_MODE =
# directory containing source files of the testbench
VERIFICATION_DIR =
# A file containing a list of testbench files in the order suitable for compilation;
# low level modules first, top level entity last.
# Test vector files should be located in the same directory and listed
# in the same file, unless fixed path is used. Please refer to tutorial for more detail.
VERIFICATION_LIST_FILE =
# name of testbench's top level entity
TB_TOP_LEVEL_ENTITY =
# name of testbench's top level architecture
TB_TOP_LEVEL_ARCH =

30 design.config.txt Functional Simulation (2)
# MAX_TIME_FUNCTIONAL_VERIFICATION =
# supported units are: ps, ns, us, and ms
# if blank, simulation will run until it finishes
# (finishes = no changes in signals, i.e., clock is stopped and no more inputs coming in)
MAX_TIME_FUNCTIONAL_VERIFICATION = <>
# Perform only verification (synthesis and implementation parameters are ignored)
# VERIFICATION_ONLY =
VERIFICATION_ONLY =

31 ATHENa – Database of Results

32 ATHENa Database http://cryptography.gmu.edu/athenadb

33 ATHENa Database – Result View
– Algorithm parameters
– Design parameters
  – Optimization target
  – Architecture type
  – Datapath width
  – I/O bus widths
  – Availability of source code
– Platform
  – Vendor, Family, Device
– Timing
  – Maximum clock frequency
  – Maximum throughput
– Resource utilization
  – Logic blocks (Slices/LEs/ALUTs)
  – Multipliers/DSP units
– Tools
  – Names & versions
  – Detailed options
– Credits
  – Designers & contact information

34 ATHENa Database – Compare Feature
Matching fields in grey. Non-matching fields in red and blue.

35 Possible Future Customizations
The same basic database can be customized and adapted for other domains, such as
– Digital Signal Processing
– Bioinformatics
– Communications
– Scientific Computing, etc.

36 ATHENa - Website

37 ATHENa Website http://cryptography.gmu.edu/athena/
– Download of ATHENa Tool
– Links to related tools
– SHA-3 Competition in FPGAs & ASICs
  – Specifications of candidates
  – Interface proposals
  – RTL source codes
  – Testbenches
  – ATHENa database of results
  – Related papers & presentations

38 GMU Source Codes and Block Diagrams
First batch of GMU Source Codes for all Round 3 SHA-3 Candidates & SHA-2 made available at the ATHENa website at: http://cryptography.gmu.edu/athena
Included in this release:
– Basic architectures
– Folded architectures
– Unrolled architectures
Each code supports two variants: with 256-bit and 512-bit output. Each source code is accompanied by comprehensive hierarchical block diagrams.

39 ATHENa Result Replication Files
– Scripts and configuration files sufficient to easily reproduce all results (without repeating optimizations)
– Automatically created by ATHENa for all results generated using ATHENa
– Stored in the ATHENa Database
In the same spirit of Reproducible Research as:
– Patrick Vandewalle (EPFL), Jelena Kovacevic (CMU), and Martin Vetterli (EPFL), “Reproducible research in signal processing - what, why, and how,” IEEE Signal Processing Magazine, May 2009. http://rr.epfl.ch/17/
– J. Claerbout (Stanford University), “Electronic documents give reproducible research a new meaning,” in Proc. 62nd Ann. Int. Meeting of the Soc. of Exploration Geophysics, 1992, http://sepwww.stanford.edu/doku.php?id=sep:research:reproducible:seg92

40 Benchmarking Goals Facilitated by ATHENa
Comparing multiple:
1. cryptographic algorithms
2. hardware architectures or implementations of the same cryptographic algorithm
3. hardware platforms from the point of view of their suitability for the implementation of a given algorithm (e.g., choice of an FPGA device or FPGA board)
4. tools and languages in terms of quality of results they generate (e.g., Verilog vs. VHDL, Synplicity Synplify Premier vs. Xilinx XST, ISE v. 13.1 vs. ISE v. 12.3)

41 George Mason University Modern FPGA Families

42 ECE 448 – FPGA and ASIC Design with VHDL: Resources
– Xcell Journal, available for FREE online @ http://www.xilinx.com/publications/xcellonline/
– Electronic Engineering Journal, available for FREE by e-mail after subscribing @ http://www.eejournal.com/subscribe or on the web @ http://www.eejournal.com/design/fpga

43 FPGA Vendors & Families

44 Major FPGA Vendors
SRAM-based FPGAs
– Xilinx, Inc. (~ 51% of the market)
– Altera Corp. (~ 34% of the market)
  (Xilinx and Altera together: ~ 85%)
– Lattice Semiconductor
– Atmel
– Achronix
– Tabula
Flash & antifuse FPGAs
– Actel Corp. (Microsemi SoC Products Group)
– Quick Logic Corp.

45 Xilinx FPGA Devices
| Technology | Low-cost   | High-performance  |
| 220 nm     | Spartan II | Virtex            |
| 120/150 nm |            | Virtex II, II Pro |
| 90 nm      | Spartan 3  | Virtex 4          |
| 65 nm      |            | Virtex 5          |
| 45 nm      | Spartan 6  |                   |
| 40 nm      |            | Virtex 6          |
| 28 nm      | Artix 7    | Virtex 7          |

46 Altera FPGA Devices
| Technology | Low-cost    | Mid-range | High-performance |
| 130 nm     | Cyclone     |           | Stratix          |
| 90 nm      | Cyclone II  |           | Stratix II       |
| 65 nm      | Cyclone III | Arria I   | Stratix III      |
| 40 nm      | Cyclone IV  | Arria II  | Stratix IV       |
| 28 nm      | Cyclone V   | Arria V   | Stratix V        |

47 LUTs & ALUTs

48 4-bit LUTs vs. 6-bit LUTs
6-bit LUTs introduced in Virtex 5

49 Major Differences between Xilinx Families
|                              | Spartan 3, Virtex 4 | Virtex 5, Virtex 6, Spartan 6 |
| Look-Up Tables               | 4-input             | 6-input                       |
| Number of CLB slices per CLB | 4                   | 2                             |
| Number of LUTs per CLB slice | 2                   | 4                             |
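A k-input LUT is a 2^k-entry truth table addressed by its inputs, so a 4-input LUT stores 16 configuration bits and a 6-input LUT stores 64. The Python sketch below is a behavioral model only, not vendor code; the class name `LUT` and the example functions are illustrative.

```python
class LUT:
    """Behavioral model of a k-input look-up table."""

    def __init__(self, k, func):
        self.k = k
        self.bits = []  # truth table, one bit per input combination
        for index in range(2 ** k):
            inputs = [(index >> pos) & 1 for pos in range(k)]
            self.bits.append(func(*inputs) & 1)

    def __call__(self, *inputs):
        if len(inputs) != self.k:
            raise ValueError(f"expected {self.k} inputs")
        index = sum((bit & 1) << pos for pos, bit in enumerate(inputs))
        return self.bits[index]


xor4 = LUT(4, lambda a, b, c, d: a ^ b ^ c ^ d)  # Spartan 3 / Virtex 4 style
and6 = LUT(6, lambda *x: int(all(x)))            # Virtex 5 and later style
print(len(xor4.bits), len(and6.bits))            # 16 64

# LUTs per CLB = slices per CLB * LUTs per slice (values from the table above)
LUTS_PER_CLB = {
    "Spartan 3, Virtex 4": 4 * 2,
    "Virtex 5, Virtex 6, Spartan 6": 2 * 4,
}
print(LUTS_PER_CLB)
```

Both groups of families end up with 8 LUTs per CLB; the difference is that each LUT in the newer families holds four times as many configuration bits.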

50 Xilinx Spartan CLB
Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

51 Virtex 5 Arrangement of Slices within the CLB

52 Spartan 3 Multipurpose LUT (MLUT)
Source: The Design Warrior’s Guide to FPGAs.

53 Virtex 5 64 x 1 Single Port RAM

54 Major Differences between Xilinx Families
|                                         | Spartan 3, Virtex 4 | Virtex 5, Virtex 6, Spartan 6 |
| Maximum Single-Port Memory Size per LUT | 16 x 1              | 64 x 1                        |
| Maximum Shift Register Size per LUT     | 16 bits             | 32 bits                       |
| Number of adder stages per CLB slice    | 2                   | 4                             |

55 Virtex 5 32-bit Shift Register, SRL

56 Altera Cyclone III Logic Element (LE) – Normal Mode

57 High-Level Block Diagram of the Stratix III ALM

58 Altera Stratix III Adaptive Logic Modules (ALM) – Normal Mode

59 FPGA Embedded Resources

60 Embedded Multipliers

61

62 Multipliers in Spartan 3
Source: The Design Warrior’s Guide to FPGAs.

63 Number of Multipliers per Spartan 3 Device

64 Combinational and Registered Multiplier

65 Dedicated Multiplier Block

66 Cyclone II

67 Embedded Multiplier Block Overview
Each Cyclone II has one to three columns of embedded multipliers. Each embedded multiplier can be configured to support
– One 18 x 18 multiplier
– Two 9 x 9 multipliers

68 Multiplier Block Architecture

69 Two Multiplier Types
– Two 9x9 multipliers
– 18x18 multiplier

70 Multiplier Stage Signals signa and signb are used to identify the signed and unsigned inputs.
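The effect of signa and signb can be illustrated with a small behavioral model. The Python sketch below assumes that an asserted control signal means the corresponding operand is interpreted as a two's complement signed number, and that operands are 18 bits wide; the helper names are hypothetical, and this is not Altera code.

```python
def to_int(value, width, signed):
    """Interpret the low `width` bits of value as unsigned or two's complement."""
    value &= (1 << width) - 1
    if signed and value >> (width - 1):
        value -= 1 << width
    return value


def multiply(a, b, signa, signb, width=18):
    """Multiply two operands, each treated as signed or unsigned independently."""
    return to_int(a, width, signa) * to_int(b, width, signb)


all_ones = 0x3FFFF  # 18 bits set: -1 if signed, 262143 if unsigned
print(multiply(all_ones, 2, signa=True, signb=False))   # -2
print(multiply(all_ones, 2, signa=False, signb=False))  # 524286
```

The same bit pattern gives two different products, which is why the multiplier has to be told how to interpret each input.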

71 3 Ways to Use Dedicated Hardware
Three (3) ways to use dedicated (embedded) hardware
– Inference
– Instantiation
– CoreGen in Xilinx, MegaWizard Plug-In Manager in Altera

72 DSP Units

73 Xilinx XtremeDSP
– Starting with the Virtex 4 family, Xilinx introduced the DSP48 block for high-speed DSP on FPGAs
– Essentially a multiply-accumulate core with many other features
– Now also in Spartan-3A, Spartan 6, Virtex 5, and Virtex 6

74 Multiplier-Accumulator - MAC
Source: The Design Warrior’s Guide to FPGAs.

75 Mathematical Functions
DSP48 can perform mathematical functions such as:
– Add/Subtract
– Accumulate
– Multiply
– Multiply-Accumulate
– Multiplexer
– Barrel Shifter
– Counter
– Divide (multi-cycle)
– Square Root (multi-cycle)
Can also create filters such as:
– Serial FIR Filter (Xilinx calls these MACC filters)
– Parallel FIR Filter
– Semi-Parallel FIR Filter
– Multi-rate FIR Filters
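To make the link between multiply-accumulate and a serial FIR filter concrete, the following Python sketch computes each output sample with one accumulator and one multiply-add per tap, which is what a single MAC unit would do over successive clock cycles. It is a behavioral illustration with hypothetical names, not a description of the DSP48 datapath.

```python
def fir_serial(samples, coeffs):
    """Serial (MACC-style) FIR filter: one multiply-accumulate per tap."""
    history = [0] * len(coeffs)  # delay line, newest sample first
    outputs = []
    for x in samples:
        history = [x] + history[:-1]
        acc = 0
        for c, h in zip(coeffs, history):
            acc += c * h  # one MAC operation (one clock cycle in hardware)
        outputs.append(acc)
    return outputs


# An impulse at the input reproduces the coefficients at the output.
print(fir_serial([1, 0, 0, 0], [1, 2, 3]))  # [1, 2, 3, 0]
print(fir_serial([1, 1, 1, 1], [1, 2, 3]))  # [1, 3, 6, 6]
```

A parallel FIR filter would instead use one multiplier per tap and produce an output every cycle; the semi-parallel form sits between the two.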

76 DSP48 Slice: Virtex 4

77 Simplified Form of DSP48
Adder Out = (Z ± (X + Y + CIN))

78 Choosing Inputs to DSP Adder
P = Adder Out = (Z ± (X + Y + CIN))
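The adder equation can be exercised with a short behavioral model. The Python sketch below assumes a 48-bit result that wraps around on overflow and treats X, Y, Z, and CIN as already-selected operand values; the multiplexers that choose them are not modeled, and the function name is hypothetical.

```python
MASK48 = (1 << 48) - 1


def dsp48_adder(z, x, y, cin=0, subtract=False):
    """P = Z +/- (X + Y + CIN), truncated to 48 bits."""
    inner = x + y + cin
    result = z - inner if subtract else z + inner
    return result & MASK48


print(dsp48_adder(10, 3, 4, cin=1))                 # 18
print(dsp48_adder(10, 3, 4, cin=1, subtract=True))  # 2
print(hex(dsp48_adder(0, 1, 0, subtract=True)))     # 0xffffffffffff
```

The last call shows the wrap-around: 0 minus 1 in 48-bit two's complement is all ones.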

79 DSP48E Slice: Virtex 5

80 New in Virtex 5 Compared to Virtex 4

81 Xilinx DSP48

82 Stratix III DSP Unit

83 Embedded Memories

84 Memory Types
– Memory: RAM, ROM
– Memory: Single port, Dual port
– Memory: With asynchronous read, With synchronous read

85 Memory Types in Xilinx
– Memory: Distributed (MLUT-based), Block RAM-based (BRAM-based)
– Memory: Inferred, Instantiated
  – Instantiated: Manually, Using Core Generator

86 Memory Types in Altera
– Memory: Distributed (ALUT-based, Stratix III onwards), Memory block-based
  – Memory block-based: Small size (512), Medium size (4K, 9K, 20K), Large size (144K, 512K)
– Memory: Inferred, Instantiated
  – Instantiated: Manually, Using MegaWizard Plug-In Manager

87 Cyclone II Memory Blocks
The embedded memory structure consists of columns of M4K memory blocks that can be configured as RAM, first-in first-out (FIFO) buffers, and ROM

88 Single-Port ROM
– The address lines of the ROM are registered
– The outputs can be registered or unregistered
– A .mif file is used to initialize the ROM contents
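A .mif (Memory Initialization File) is a small text file listing the depth, width, radixes, and address/data pairs. The Python sketch below writes one for a list of words; the layout follows the commonly used form of the format as far as I know it, so it should be checked against the Quartus II documentation before use. The function name `make_mif` and the sample contents are illustrative.

```python
def make_mif(words, width=8):
    """Return .mif text for a list of non-negative integer words."""
    digits = (width + 3) // 4  # hex digits needed per data word
    lines = [
        f"DEPTH = {len(words)};",
        f"WIDTH = {width};",
        "ADDRESS_RADIX = HEX;",
        "DATA_RADIX = HEX;",
        "CONTENT",
        "BEGIN",
    ]
    for address, word in enumerate(words):
        if not 0 <= word < (1 << width):
            raise ValueError(f"word {word} does not fit in {width} bits")
        lines.append(f"{address:X} : {word:0{digits}X};")
    lines.append("END;")
    return "\n".join(lines) + "\n"


mif_text = make_mif([0x00, 0x1F, 0xFF])
print(mif_text)
```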

89 Stratix II TriMatrix Memory

90

91 Stratix III & Stratix IV TriMatrix Memory

92 Stratix II & III Shift-Register Memory Configuration

93 Supply Voltage

94 Change in Supply Voltages
| Year | Technology (nm) | Core Supply Voltage (V) |
| 1998 | 350             | 3.3                     |
| 1999 | 250             | 2.5                     |
| 2000 | 180             | 1.8                     |
| 2001 | 150             | 1.5                     |
| 2003 | 130             | 1.2                     |
| 2008 | 65              | 1.0                     |
| 2009 | 40              | 0.9                     |
| 2011 | 28              | 0.9                     |

95 Gigabit Transceivers

96 Using a Bus to Communicate Between Devices
Source: The Design Warrior’s Guide to FPGAs.

97 Using High-Speed Transceivers to Communicate Between Devices
Source: The Design Warrior’s Guide to FPGAs.

98 Using High-Speed Transceivers to Communicate Between Devices
Source: The Design Warrior’s Guide to FPGAs.

99 Effect of Noise on Single Wire and Differential Pair
Source: The Design Warrior’s Guide to FPGAs.

100 Generating a Differential Pair
Source: The Design Warrior’s Guide to FPGAs.

101 Multiple Standards for High-Speed Serial Communication
– Fibre Channel
– InfiniBand
– PCI Express (developed by Intel)
– RapidIO
– SkyRail (developed by MindSpeed Technologies)
– 10-gigabit Ethernet
Source: The Design Warrior’s Guide to FPGAs.

102 Using FPGA to Interface Between Multiple Standards
Source: The Design Warrior’s Guide to FPGAs.

103 Ganging Multiple Transceivers Together
Source: The Design Warrior’s Guide to FPGAs.

104 An Ideal Signal vs. Signal Seen by Receiver
Source: The Design Warrior’s Guide to FPGAs.

105 The Effects of Transmitting a Series of Identical Bits
Source: The Design Warrior’s Guide to FPGAs.

106 Main Elements of the Transceiver Block
Source: The Design Warrior’s Guide to FPGAs.

107 Recovering Clock Signal

108 Sampling the Incoming Signal

109 Xilinx ML605 Evaluation Kit $1795

110 PLDA XpressV6 Design Kit $3990

111 HiTech Global PCI Express Gen 2 / SFP+ / USB 3.0 Development Board $2995

112 HiTech Global HXT 8-lane PCI Express/4-port SFP+ Optical Network Card $8995

113 HiTech Global HXT 16-lane PCI Express Optical Network Card $8995

114 Altera Stratix IV GX FPGA Development Kit $4495

115 PLDA XpressGX4LP Design Kit $4990

116 HiTech Global GT/GX PCI Express Gen 2 / 3 & Optical Development Platform/Networking Card $5995

117 Terasic DE4 Development and Education Board $2995

118 Gutz Logic PCI Express x1 Demo Board (Actel FPGA)

119 LatticeSC PCI Express x4 Evaluation

120 Board Overview
| Manufacturer  | Name                    | FPGA         | Memory     | Application     | PCIe            | Throughput | Base Price |
Boards based on Xilinx Virtex-6
| Xilinx        | ML605 Evaluation Kit    | LX240T-1     | 2GB (max)  | General Purpose | 1.1 x8 / 2.0 x4 | 2 GB/s     | $1795      |
| PLDA          | XpressV6 Design Kit     | LX550T (max) | 8GB (max)  | General Purpose | 2.0 x8          | 4 GB/s     | $3990      |
| HiTech Global | PCI Express / USB 3.0   | LX550T (max) | 8GB (max)  | General Purpose | 2.0 x8          | 4 GB/s     | $2995      |
| HiTech Global | HXT 8-lane Optical      | HX565T (max) | 16GB (max) | High Speed Eth. | 2.0 x8          | 4 GB/s     | $8995      |
| HiTech Global | HXT 16-lane Optical     | HX565T (max) | 16GB (max) | High Speed Eth. | 2.0 x16         | 8 GB/s     | $8995      |
Boards based on Altera Stratix IV
| Altera        | Stratix IV GX Kit       | GX530 (max)  | 512MB      | General Purpose | 2.0 x8          | 4 GB/s     | $4495      |
| PLDA          | XpressGX4LP             | GX530 (max)  | 2GB (max)  | High Speed Eth. | 2.0 x8          | 4 GB/s     | $4990      |
| HiTech Global | GT/GX PCIe & Optical    | 100G5 (max)  | 4GB (max)  | High Speed Eth. | 3.0 x8          | 8 GB/s     | $5995      |
| Terasic       | DE4 Board               | GX530 (max)  | 8GB (max)  | General Purpose | 2.0 x8          | 4 GB/s     | $2995      |
Boards based on Actel ProASIC3
| Gutz Logic    | PCIe x1 Demo Board      | A3P1000      | 1MB        | PCIe Evaluation | 1.1 x1          | 250 MB/s   | N/A        |
Boards based on Lattice Semiconductor LatticeSC
| Lattice       | LatticeSC PCIe x4 Board | ECP2M-50     | 32MB       | PCIe Evaluation | 1.1 x4          | 1 GB/s     | N/A        |
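The throughput column follows from the PCIe generation and lane count. The Python sketch below uses the standard per-lane signaling rates and line encodings (2.5 GT/s and 5 GT/s with 8b/10b for generations 1.x and 2.0, 8 GT/s with 128b/130b for 3.0), which are general PCI Express facts rather than something stated on the slide; the function name is hypothetical. Results are per direction and ignore packet overhead.

```python
# generation -> (transfer rate in MT/s, payload bits, line bits of the encoding)
PCIE_GENERATIONS = {
    "1.1": (2500, 8, 10),
    "2.0": (5000, 8, 10),
    "3.0": (8000, 128, 130),
}


def pcie_mb_per_s(generation, lanes):
    """Raw one-direction bandwidth in MB/s for a given generation and lane count."""
    rate_mt, payload_bits, line_bits = PCIE_GENERATIONS[generation]
    return rate_mt * payload_bits / (line_bits * 8) * lanes


for gen, lanes in [("1.1", 1), ("1.1", 4), ("1.1", 8), ("2.0", 4),
                   ("2.0", 8), ("2.0", 16), ("3.0", 8)]:
    print(gen, f"x{lanes}", round(pcie_mb_per_s(gen, lanes)), "MB/s")
```

Generation 1.1 and 2.0 links give exactly the table's 250 MB/s, 1 GB/s, 2 GB/s, 4 GB/s, and 8 GB/s figures; the 3.0 x8 link comes out slightly under 8 GB/s, which the table rounds up.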

121 121 ECE 448 – FPGA and ASIC Design with VHDL Embedded Microprocessors

122 122 Embedded Microprocessor Cores. Source: The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043. Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)

123 123 Virtex-II Pro Architecture. Features: 1. Processor Block 2. RocketIO Multi-Gigabit Transceivers 3. CLB and Configurable Logic 4. SelectIO-Ultra 5. Digital Clock Managers 6. Multipliers and Block SelectRAM

124 124

125 125 PowerPC Cores; PowerPC System

126 126 Embedded Development Kit (EDK) [design flow diagram]
Hardware Flow: Microprocessor Hardware Specification File; Processor IP, Microprocessor Peripheral Description Files; PlatGen; VHDL / Verilog; EDIF IP Netlists; Synthesizer; System Constraint File; ISE / Xflow; Bitstream
Software Flow: Microprocessor Software Specification File; LibGen; Libraries; C / C++ Code; Compiler; Object Files; Linker; Executable
Both flows: Data2MEM; Download to FPGA

127 127 Zynq - Extensible Processing Platform

128 128 Zynq-7000 EPP

129 129 Zynq-7000 Product Table

130 George Mason University Follow-up Courses

131 ECE Department
Programs: MS in Electrical Engineering (MS EE); MS in Computer Engineering (MS CpE)
Specializations: COMMUNICATIONS & NETWORKING; SIGNAL PROCESSING; CONTROL & ROBOTICS; MICROELECTRONICS/NANOELECTRONICS; SYSTEM DESIGN; DIGITAL SYSTEMS DESIGN; COMPUTER NETWORKS; MICROPROCESSORS & EMBEDDED SYSTEMS; NETWORK & SYSTEM SECURITY; BIOENGINEERING

132 DIGITAL SYSTEMS DESIGN
1. ECE 545 Digital System Design with VHDL (Fall) – K. Gaj, project, FPGA design with VHDL, Aldec/Synplicity/Xilinx/Altera
2. ECE 645 Computer Arithmetic (Spring) – K. Gaj, project, FPGA design with VHDL or Verilog, Aldec/Synplicity/Xilinx/Altera
3. ECE 586 Digital Integrated Circuits (Spring) – D. Ioannou
4. ECE 681 VLSI Design for ASICs (Fall) – H. Homayoun, project/lab, front-end and back-end ASIC design with Synopsys tools
5. ECE 682 VLSI Test Concepts (Spring) – T. Storey, homework
6. ECE 699 Digital Signal Processing Hardware Architectures (Fall) – A. Cohen, project, FPGA design with VHDL or Verilog

133 Possible New Graduate Computer Engineering Courses (looking for instructors)
5xx Digital System Design with Verilog
6xx Reconfigurable Computing

134 NETWORK AND SYSTEM SECURITY
1. ECE 542 Computer Network Architectures and Protocols (Fall, Spring) – S.-C. Chang, et al.
2. ECE 646 Cryptography and Computer Network Security (Fall) – K. Gaj, J-P. Kaps – lab, project: software/hardware/analytical
3. ECE 746 Advanced Applied Cryptography (every 2nd Spring, 2013) – K. Gaj, J-P. Kaps – lab, project: software/hardware/analytical
4. ECE 699 Cryptographic Engineering (every 2nd Spring, 2014) – J-P. Kaps – lectures + student and invited-guest seminars
5. ISA 656 Network Security (Fall, Spring) – A. Stavrou

135 ECE 645 Computer Arithmetic Instructor: Dr. Kris Gaj

136 Advanced digital circuit design course covering efficient architectures for
- addition and subtraction
- multiplication
- division and modular reduction
- exponentiation
applied to
- Integers: unsigned and signed
- Real numbers: fixed point; single and double precision floating point
- Elements of the Galois field GF(2^n): polynomial basis

137 Course Objectives
At the end of this course you should be able to:
- Understand mathematical and gate-level algorithms for computer addition, subtraction, multiplication, division, and exponentiation
- Understand the tradeoffs among performance, area, latency, scalability, etc. across different arithmetic architectures
- Synthesize and implement computer arithmetic blocks on FPGAs
- Be comfortable with different number systems, and have familiarity with floating-point and Galois field arithmetic for future study
- Understand sources of error in computer arithmetic and the basics of error analysis
This knowledge will come about through homework, projects, and practice exams.

138 Lecture topics
INTRODUCTION
1. Applications of computer arithmetic algorithms. Initial discussion of project topics.

139 ADDITION AND SUBTRACTION
1. Basic addition, subtraction, and counting
2. Addition in Xilinx and Altera FPGAs
3. Carry-lookahead, carry-select, and hybrid adders
4. Adders based on Parallel Prefix Networks
5. Pipelined Adders
6. Modular addition and subtraction

140 MULTIOPERAND ADDITION
1. Carry-save adders
2. Wallace and Dadda Trees
3. Adding multiple unsigned and signed numbers
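The carry-save idea behind the multioperand addition topics above can be sketched in software. This is a minimal Python model, not course material: it assumes non-negative integers, uses a simple linear chain of 3:2 compressors rather than a Wallace or Dadda tree, and the function names are illustrative.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 compressor: reduce three operands to a (sum, carry) pair.

    Each bit position is handled independently, so no carry ripples
    across positions inside this step.
    """
    partial_sum = x ^ y ^ z                       # bitwise sum without carries
    carry = ((x & y) | (x & z) | (y & z)) << 1    # majority bits, shifted to the next weight
    return partial_sum, carry


def multi_operand_add(operands: list[int]) -> int:
    """Add non-negative integers using repeated carry-save reduction."""
    terms = list(operands)
    while len(terms) > 2:
        x, y, z = terms.pop(), terms.pop(), terms.pop()
        terms.extend(carry_save_add(x, y, z))     # three terms in, two terms out
    # A single conventional (carry-propagate) addition finishes the job.
    return sum(terms)


if __name__ == "__main__":
    print(multi_operand_add([3, 5, 7, 9]))
```

In hardware the compressors would be arranged as a tree so that several reductions happen in parallel; the chain here only demonstrates that each step preserves the total.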

141 NUMBER REPRESENTATIONS
Unsigned Integers
Signed Integers
Fixed-point real numbers
Floating-point real numbers
Elements of the Galois Field GF(2^n)

142 LONG INTEGER ARITHMETIC
1. Modular Exponentiation
2. Montgomery Multipliers and Exponentiation Units

143 MULTIPLICATION
1. Tree and array multipliers
2. Sequential multipliers
3. Multiplication of signed numbers and squaring
4. Multiplication in Xilinx and Altera FPGAs
   - using distributed logic
   - using embedded multipliers
   - using DSP blocks
5. Multiple clock systems

144 DIVISION
1. Basic restoring and non-restoring sequential dividers
2. SRT and high-radix dividers
3. Array dividers
4. Division by Convergence
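As an illustration of the first division topic above, here is a minimal Python model of a radix-2 restoring sequential divider. It is a sketch under stated assumptions: unsigned operands, one quotient bit per iteration, and a hypothetical function name and `width` parameter that do not come from the slides.

```python
def restoring_divide(dividend: int, divisor: int, width: int) -> tuple[int, int]:
    """Unsigned restoring division, producing one quotient bit per step.

    `width` is the dividend bit width, i.e. the number of iterations a
    sequential hardware divider would take.
    """
    if divisor <= 0:
        raise ZeroDivisionError("divisor must be positive")
    if not 0 <= dividend < (1 << width):
        raise ValueError("dividend does not fit in the given width")

    remainder = 0
    quotient = 0
    for i in range(width - 1, -1, -1):
        # Shift the partial remainder left and bring in the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        trial = remainder - divisor
        if trial >= 0:
            remainder = trial          # subtraction succeeded: quotient bit is 1
            quotient |= 1 << i
        # Otherwise the old remainder is kept ("restored"): quotient bit is 0.
    return quotient, remainder


if __name__ == "__main__":
    print(restoring_divide(13, 3, 4))
```

A non-restoring divider avoids the restore step by allowing negative partial remainders and correcting at the end; that variant is not shown here.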

145 FLOATING POINT AND GALOIS FIELD ARITHMETIC
1. Floating-point units
2. Galois Field GF(2^n) units

146 ECE 682 VLSI Test Concepts Instructor: Dr. Tom Storey

147 Course Description
Broad introduction to basic concepts, techniques, and tools of modern VLSI testing. Fundamentals of defect modeling, fault simulation, design for testability, built-in self-test techniques, and failure analysis. Topics include test economics, physical defects and fault modeling, automated test pattern generation, fault simulation, design for test, built-in self-test, memory test, PLD test, mixed-signal test, Iddq test, boundary scan and related standards, test synthesis, diagnosis and failure analysis, automated test equipment, and embedded core test.

148 Course Text

149 Course Topics
Introduction to Test Methods, Test Equipment, and the Economics of Test
Fault and Defect Modeling
Logic Test Generation
Fault Simulation
Memory Test
Design for Testability
Advanced Testing Methods
Future of VLSI Test

150 Course Changes
New Text
– Updated to reflect advances in state of the art
– Covers a broader range of test topics
– More engaging text, figures
Course Content
– Redone to reflect textbook change
– Added developments since text was written
– More emphasis on industry examples/war stories

