Published by Lilian Golden. Modified over 8 years ago.
1
George Mason University ATHENa - Automated Tool for Hardware EvaluatioN Modern FPGA Families ECE 545 Lecture 12
2
George Mason University ATHENa
3
Resources: ATHENa website http://cryptography.gmu.edu/athena
4
ATHENa – Automated Tool for Hardware EvaluatioN. Supported in part by the National Institute of Standards & Technology (NIST)
5
ATHENa Team
Venkata “Vinny”: MS CpE student
Ekawat “Ice”: PhD CpE student
Marcin: PhD ECE student
Rajesh: PhD ECE student
Michal: PhD exchange student from Slovakia
John: MS CpE student
6
ATHENa – Automated Tool for Hardware EvaluatioN
Open-source benchmarking tool, written in Perl, aimed at AUTOMATED generation of OPTIMIZED results for MULTIPLE hardware platforms.
Currently under development at George Mason University. http://cryptography.gmu.edu/athena
7
Why Athena?
“The Greek goddess Athena was frequently called upon to settle disputes between the gods or various mortals. Athena Goddess of Wisdom was known for her superb logic and intellect. Her decisions were usually well-considered, highly ethical, and seldom motivated by self-interest.”
From “Athena, Greek Goddess of Wisdom and Craftsmanship”
8
Basic Dataflow of ATHENa
[Figure: dataflow between the Designer, the ATHENa Server, and the User, in steps numbered 0 to 6. Labels recoverable from the slide: Interfaces + Testbenches (step 0); HDL + scripts + configuration files; FPGA Synthesis and Implementation; Result Summary + Database Entries; Database Entries; Download scripts and configuration files; HDL + FPGA Tools; Database query; Ranking of designs.]
9
[Figure: files handled by ATHENa: synthesizable source files, configuration files, testbench, constraint files, result summary (user-friendly), database entries (machine-friendly).]
10
ATHENa Major Features (1)
synthesis, implementation, and timing analysis in batch mode
support for devices and tools of multiple FPGA vendors
generation of results for multiple families of FPGAs of a given vendor
automated choice of a best-matching device within a given family
11
ATHENa Major Features (2)
automated verification of designs through simulation in batch mode
support for multi-core processing
automated extraction and tabulation of results
several optimization strategies aimed at finding
– optimum options of tools
– best target clock frequency
– best starting point of placement
12
Generation of Results Facilitated by ATHENa
batch mode of FPGA tools
ease of extraction and tabulation of results: Text Reports, Excel, CSV (Comma-Separated Values)
optimized choice of tool options: GMU_optimization_1 strategy
13
Relative Improvement of Results from Using ATHENa
Virtex 5, 256-bit Variants of Hash Functions
Ratios of results obtained using ATHENa suggested options vs. default options of FPGA tools
14
Other (Somewhat) Similar Tools
ExploreAhead (part of PlanAhead)
Design Space Explorer (DSE)
Boldport Flow
EDAx10 Cloud Platform
15
Distinguishing Features of ATHENa
Support for multiple tools from multiple vendors
Optimization strategies aimed at the best possible performance rather than design closure
Extraction and presentation of results
Seamless integration with the ATHENa database of results
16
How To Start Working With ATHENa? One-Time Tasks
Download and unzip ATHENa: http://cryptography.gmu.edu/athena/
Read the Tutorial!
Install the Required Tools (see Tutorial - Part 1 – Tools Installation)
Run ATHENa_setup
17
How To Start Working With ATHENa? Repetitive Tasks
Prepare or modify your source files & source_list.txt
Modify design.config.txt + possibly other configuration files
Run ATHENa
18
design.config.txt: Your Design
# directory containing synthesizable source files for the project
SOURCE_DIR =
# A file list containing list of files in the order suitable for synthesis and implementation
# low level modules first, top level entity last
SOURCE_LIST_FILE = source_list.txt
# project name
# it will be used in the names of result directories
PROJECT_NAME = SHA256
# name of top level entity
TOP_LEVEL_ENTITY = sha256
# name of top level architecture
TOP_LEVEL_ARCH = rs_arch
# name of clock net
CLOCK_NET = clk
19
design.config.txt: Timing Formulas
# formula for latency
LATENCY = TCLK*65
# formula for throughput
THROUGHPUT = 512/(TCLK*65)
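The two formulas above can be evaluated by hand for any achieved clock period. The sketch below is illustrative only and is not part of ATHENa. It assumes that TCLK is given in ns, that 65 is the number of clock cycles needed per 512-bit block, and that throughput is wanted in Mbit/s; the function names are hypothetical.

```python
# Illustrative sketch (not ATHENa code): evaluate the design.config.txt
# timing formulas for a given clock period.
# Assumption: TCLK is in ns, so 512 / (TCLK * 65) is in bits/ns (= Gbit/s);
# multiplying by 1000 converts it to Mbit/s.

CYCLES_PER_BLOCK = 65   # from LATENCY = TCLK*65
BLOCK_SIZE_BITS = 512   # from THROUGHPUT = 512/(TCLK*65)


def latency_ns(tclk_ns: float) -> float:
    """LATENCY = TCLK*65, in ns."""
    return tclk_ns * CYCLES_PER_BLOCK


def throughput_mbits_per_s(tclk_ns: float) -> float:
    """THROUGHPUT = 512/(TCLK*65), converted from bits/ns to Mbit/s."""
    return BLOCK_SIZE_BITS / (tclk_ns * CYCLES_PER_BLOCK) * 1000.0


if __name__ == "__main__":
    tclk = 10.0  # hypothetical clock period in ns (100 MHz)
    print(f"latency    = {latency_ns(tclk):.1f} ns")
    print(f"throughput = {throughput_mbits_per_s(tclk):.1f} Mbit/s")
```

Because both formulas depend only on TCLK, a shorter clock period improves latency and throughput by the same factor for this design.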
20
design.config.txt: Application & Optimization Target
# OPTIMIZATION_TARGET = speed | area | balanced
OPTIMIZATION_TARGET = speed
# OPTIONS = default | user
OPTIONS = default
# APPLICATION = single_run | exhaustive_search | placement_search | frequency_search |
# GMU_Optimization_1 | GMU_Xilinx_optimization_1
APPLICATION = single_run
# TRIM_MODE = off | zip | delete
TRIM_MODE = zip
21
design.config.txt: FPGA Families
# commenting the next line removes all families of Xilinx
FPGA_VENDOR = xilinx
# commenting the next line removes a given family
FPGA_FAMILY = spartan3
# FPGA_DEVICES = | best_match | all
FPGA_DEVICES = best_match
SYN_CONSTRAINT_FILE = default
IMP_CONSTRAINT_FILE = default
REQ_SYN_FREQ = 120
REQ_IMP_FREQ = 100
MAX_SLICE_UTILIZATION = 0.8
MAX_BRAM_UTILIZATION = 0.8
MAX_MUL_UTILIZATION = 1
MAX_PIN_UTILIZATION = 0.9
END FAMILY
END VENDOR
22
design.config.txt: FPGA Families
# commenting the next line removes all families of Altera
FPGA_VENDOR = altera
# commenting the next line removes a given family
FPGA_FAMILY = Stratix III
# FPGA_DEVICES = | best_match | all
FPGA_DEVICES = best_match
SYN_CONSTRAINT_FILE = default
IMP_CONSTRAINT_FILE = default
REQ_IMP_FREQ = 120
MAX_LOGIC_UTILIZATION = 0.8
MAX_MEMORY_UTILIZATION = 0.8
MAX_DSP_UTILIZATION = 0
MAX_MUL_UTILIZATION = 0
MAX_PIN_UTILIZATION = 0.8
END FAMILY
END VENDOR
23
Library Files
device_lib/xilinx_device_lib.txt
device_lib/altera_device_lib.txt
Files created during ATHENa setup. They characterize the FPGA families and devices available in the versions of the Xilinx and Altera tools installed on your computer.
Currently supported tool versions:
– Xilinx WebPACK 9.1, 9.2, 10.1, 11.1, 11.5, 12.1, 12.2, 12.3
– Xilinx Design Suite 11.1, 12.1, 12.2, 12.3
– Altera Quartus II Web Edition 8.1, 8.2, 9.0, 9.1, 10.0
– Altera Quartus II Subscription Edition 9.1, 10.0
If a library for a given version is not available yet, use a library from the closest available version.
24
Library Files: device_lib/xilinx_device_lib.txt
VENDOR = Xilinx
# Device, Total Slices, Block RAMs, DSP, Dedicated Multipliers, Maximum User I/O Pins
ITEM_ORDER = SLICE, BRAM, DSP, MULT, IO
FAMILY = spartan3
xc3s50pq208-5, 768, 4, 0, 4, 124
xc3s200ft256-5, 1920, 12, 0, 12, 173
xc3s400fg456-5, 3584, 16, 0, 16, 264
xc3s1000fg676-5, 7680, 24, 0, 24, 391
xc3s1500fg676-5, 13312, 32, 0, 32, 487
END_FAMILY
FAMILY = virtex5
xc5vlx30ff676-3, 4800, 32, 32, 0, 400
xc5vfx30tff665-3, 5120, 68, 64, 0, 360
xc5vlx30tff665-3, 4800, 36, 32, 0, 360
xc5vlx50ff1153-3, 7200, 48, 48, 0, 560
xc5vlx50tff1136-3, 7200, 60, 48, 0, 480
END_FAMILY
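To make the interplay between the device library and the MAX_*_UTILIZATION limits concrete, here is a minimal sketch of how a best-matching device could be chosen. It is not ATHENa's actual algorithm, which the slides do not show. It assumes that "best match" means the smallest device (by slice count) whose resources, scaled by the utilization caps, cover the design's needs. The function name, the DSP cap, and the example resource needs are hypothetical.

```python
# Illustrative sketch only: ATHENa's real best_match logic is not shown in
# the slides, so this simply picks the smallest device that fits.

ITEM_ORDER = ("SLICE", "BRAM", "DSP", "MULT", "IO")

# Entries copied from the spartan3 section of the library excerpt above.
SPARTAN3 = [
    ("xc3s50pq208-5", 768, 4, 0, 4, 124),
    ("xc3s200ft256-5", 1920, 12, 0, 12, 173),
    ("xc3s400fg456-5", 3584, 16, 0, 16, 264),
    ("xc3s1000fg676-5", 7680, 24, 0, 24, 391),
    ("xc3s1500fg676-5", 13312, 32, 0, 32, 487),
]

# Caps taken from the Xilinx section of design.config.txt; the DSP cap is an
# assumption, because that section does not list one.
MAX_UTILIZATION = {"SLICE": 0.8, "BRAM": 0.8, "DSP": 1.0, "MULT": 1.0, "IO": 0.9}


def best_match(devices, needed, max_utilization):
    """Return the name of the smallest fitting device, or None if none fits."""
    for name, *resources in sorted(devices, key=lambda dev: dev[1]):
        available = dict(zip(ITEM_ORDER, resources))
        fits = all(
            needed[item] <= available[item] * max_utilization[item]
            for item in ITEM_ORDER
        )
        if fits:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical resource needs of a small design.
    needed = {"SLICE": 74, "BRAM": 4, "DSP": 0, "MULT": 7, "IO": 20}
    print(best_match(SPARTAN3, needed, MAX_UTILIZATION))
```

In this example the smallest device is rejected because 4 block RAMs exceed 80% of its 4 available, so the next larger device is selected.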
25
Result Files: report_resource_utilization.txt
xilinx : spartan3
+---------+-----------------+-----+------+---+--------+---+-------+----+-------+----+------+---+----+----+
| GENERIC | DEVICE          | RUN | LUTs | % | SLICES | % | BRAMs | %  | MULTs | %  | DSPs | % | IO | %  |
+---------+-----------------+-----+------+---+--------+---+-------+----+-------+----+------+---+----+----+
| default | xc3s200ft256-5* | 1   | 142  | 3 | 74     | 3 | 4     | 33 | 7     | 58 | 0    | 0 | 20 | 11 |
+---------+-----------------+-----+------+---+--------+---+-------+----+-------+----+------+---+----+----+
xilinx : spartan6
+---------+------------------+-----+------+---+--------+---+-------+---+-------+---+------+----+----+----+
| GENERIC | DEVICE           | RUN | LUTs | % | SLICES | % | BRAMs | % | MULTs | % | DSPs | %  | IO | %  |
+---------+------------------+-----+------+---+--------+---+-------+---+-------+---+------+----+----+----+
| default | xc6slx9csg324-3* | 1   | 41   | 1 | 22     | 1 | 4     | 6 | 0     | 0 | 9    | 56 | 20 | 10 |
+---------+------------------+-----+------+---+--------+---+-------+---+-------+---+------+----+----+----+
xilinx : virtex5
+---------+-------------------+-----+------+---+--------+---+-------+----+-------+---+------+----+----+----+
| GENERIC | DEVICE            | RUN | LUTs | % | SLICES | % | BRAMs | %  | MULTs | % | DSPs | %  | IO | %  |
+---------+-------------------+-----+------+---+--------+---+-------+----+-------+---+------+----+----+----+
| default | xc5vlx20tff323-2* | 1   | 101  | 1 | 56     | 1 | 4     | 15 | 0     | 0 | 9    | 37 | 20 | 11 |
+---------+-------------------+-----+------+---+--------+---+-------+----+-------+---+------+----+----+----+
xilinx : virtex6
+---------+-------------------+-----+------+---+--------+---+-------+---+-------+---+------+---+----+---+
| GENERIC | DEVICE            | RUN | LUTs | % | SLICES | % | BRAMs | % | MULTs | % | DSPs | % | IO | % |
+---------+-------------------+-----+------+---+--------+---+-------+---+-------+---+------+---+----+---+
| default | xc6vlx75tff784-3* | 1   | 44   | 1 | 21     | 1 | 4     | 1 | 0     | 0 | 9    | 3 | 20 | 5 |
+---------+-------------------+-----+------+---+--------+---+-------+---+-------+---+------+---+----+---+
26
Result Files: report_timing.txt
REQ SYN FREQ – Requested synthesis clk. freq.
SYN FREQ – Achieved synthesis clk. freq.
REQ SYN TCLK – Requested synthesis clk. period
SYN TCLK – Achieved synthesis clk. period
REQ IMP FREQ – Requested implement. clk. freq.
IMP FREQ – Achieved implement. clk. freq.
REQ IMP TCLK – Requested implement. clk. period
IMP TCLK – Achieved implement. clk. period
LATENCY – Latency [ns]
THROUGHPUT – Throughput [Mbits/s]
TP/Area – Throughput/Area [(Mbits/s)/CLB slices]
Latency*Area – Latency*Area [ns*CLB slices]
xilinx : spartan3
+---------+-----------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| GENERIC | DEVICE          | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area    | Latency*Area |
+---------+-----------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| default | xc3s200ft256-5* | 1   | default      | 207.370  | default      | 4.822    | default      | 112.448  | default      | 8.893    | 17.786  | 449.792    | 6.078      | 1316.164     |
+---------+-----------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
xilinx : spartan6
+---------+------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| GENERIC | DEVICE           | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area    | Latency*Area |
+---------+------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| default | xc6slx9csg324-3* | 1   | default      | 75.751   | default      | 13.201   | default      | 78.119   | default      | 12.801   | 25.602  | 312.476    | 14.203     | 563.244      |
+---------+------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
xilinx : virtex5
+---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| GENERIC | DEVICE            | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area    | Latency*Area |
+---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| default | xc5vlx20tff323-2* | 1   | default      | 156.347  | default      | 6.396    | default      | 126.952  | default      | 7.877    | 15.754  | 507.808    | 9.068      | 882.224      |
+---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
xilinx : virtex6
+---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| GENERIC | DEVICE            | RUN | REQ SYN FREQ | SYN FREQ | REQ SYN TCLK | SYN TCLK | REQ IMP FREQ | IMP FREQ | REQ IMP TCLK | IMP TCLK | LATENCY | THROUGHPUT | TP/Area    | Latency*Area |
+---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
| default | xc6vlx75tff784-3* | 1   | default      | 158.053  | default      | 6.327    | default      | 135.410  | default      | 7.385    | 14.770  | 541.638    | 25.792     | 310.170      |
+---------+-------------------+-----+--------------+----------+--------------+----------+--------------+----------+--------------+----------+---------+------------+------------+--------------+
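The derived columns of report_timing.txt can be cross-checked against the raw ones. The sketch below is an illustration and not ATHENa code. It recomputes IMP FREQ from IMP TCLK, and TP/Area and Latency*Area from the reported throughput, latency, and the SLICES column of report_resource_utilization.txt. The helper names are hypothetical, and taking area as the slice count is inferred from the units in the legend.

```python
# Illustrative sketch: recompute the derived columns of report_timing.txt.
# Values are copied from the two report files; area is taken to be the
# SLICES column of report_resource_utilization.txt (an inference from the
# "[(Mbits/s)/CLB slices]" unit in the legend).

ROWS = [
    # (device, imp_tclk_ns, latency_ns, throughput_mbits_s, slices)
    ("xc3s200ft256-5", 8.893, 17.786, 449.792, 74),
    ("xc6slx9csg324-3", 12.801, 25.602, 312.476, 22),
    ("xc5vlx20tff323-2", 7.877, 15.754, 507.808, 56),
    ("xc6vlx75tff784-3", 7.385, 14.770, 541.638, 21),
]


def imp_freq_mhz(tclk_ns: float) -> float:
    """Clock frequency in MHz for a clock period given in ns."""
    return 1000.0 / tclk_ns


def tp_per_area(throughput_mbits_s: float, slices: int) -> float:
    """Throughput/Area in (Mbits/s) per CLB slice."""
    return throughput_mbits_s / slices


def latency_times_area(latency_ns: float, slices: int) -> float:
    """Latency*Area in ns * CLB slices."""
    return latency_ns * slices


if __name__ == "__main__":
    for device, tclk, latency, throughput, slices in ROWS:
        print(
            f"{device:18s} "
            f"f={imp_freq_mhz(tclk):8.3f} MHz  "
            f"TP/Area={tp_per_area(throughput, slices):7.3f}  "
            f"Latency*Area={latency_times_area(latency, slices):9.3f}"
        )
```

Throughput/Area rewards designs that are both fast and small, which is why the Virtex 6 run ranks highest on that metric even though its clock frequency is only slightly above that of the Virtex 5 run.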
27
Result Files: report_options.txt
xilinx : spartan3
+---------+-----------------+-----+------------+------------------------------+-------------------------+--------------+
| GENERIC | DEVICE          | RUN | COST TABLE | Synthesis Options            | Map Options             | PAR Options  |
+---------+-----------------+-----+------------+------------------------------+-------------------------+--------------+
| default | xc3s200ft256-5* | 1   | 1          | -opt_level 1 -opt_mode speed | -c 100 -pr b -cm speed  | -w -ol std   |
+---------+-----------------+-----+------------+------------------------------+-------------------------+--------------+
xilinx : spartan6
+---------+------------------+-----+------------+------------------------------+---------------+--------------+
| GENERIC | DEVICE           | RUN | COST TABLE | Synthesis Options            | Map Options   | PAR Options  |
+---------+------------------+-----+------------+------------------------------+---------------+--------------+
| default | xc6slx9csg324-3* | 1   | 1          | -opt_level 1 -opt_mode speed | -c 100 -pr b  | -w -ol std   |
+---------+------------------+-----+------------+------------------------------+---------------+--------------+
xilinx : virtex5
+---------+-------------------+-----+------------+------------------------------+-------------------------+--------------+
| GENERIC | DEVICE            | RUN | COST TABLE | Synthesis Options            | Map Options             | PAR Options  |
+---------+-------------------+-----+------------+------------------------------+-------------------------+--------------+
| default | xc5vlx20tff323-2* | 1   | 1          | -opt_level 1 -opt_mode speed | -c 100 -pr b -cm speed  | -w -ol std   |
+---------+-------------------+-----+------------+------------------------------+-------------------------+--------------+
xilinx : virtex6
+---------+-------------------+-----+------------+------------------------------+---------------+--------------+
| GENERIC | DEVICE            | RUN | COST TABLE | Synthesis Options            | Map Options   | PAR Options  |
+---------+-------------------+-----+------------+------------------------------+---------------+--------------+
| default | xc6vlx75tff784-3* | 1   | 1          | -opt_level 1 -opt_mode speed | -c 100 -pr b  | -w -ol std   |
+---------+-------------------+-----+------------+------------------------------+---------------+--------------+
COST TABLE – parameter determining the starting point of placement
Synthesis Options – options of the synthesis tool
Map Options – options of the mapping tool
PAR Options – options of the place & route tool
28
Result Files: report_execution_time.txt
xilinx : spartan3
+---------+-----------------+-----+----------------+---------------------+--------------+
| GENERIC | DEVICE          | RUN | Synthesis Time | Implementation Time | Elapsed Time |
+---------+-----------------+-----+----------------+---------------------+--------------+
| default | xc3s200ft256-5* | 1   | 0d 0h:0m:12s   | 0d 0h:0m:36s        | 0d 0h:0m:48s |
+---------+-----------------+-----+----------------+---------------------+--------------+
xilinx : spartan6
+---------+------------------+-----+----------------+---------------------+--------------+
| GENERIC | DEVICE           | RUN | Synthesis Time | Implementation Time | Elapsed Time |
+---------+------------------+-----+----------------+---------------------+--------------+
| default | xc6slx9csg324-3* | 1   | 0d 0h:0m:21s   | 0d 0h:1m:13s        | 0d 0h:1m:34s |
+---------+------------------+-----+----------------+---------------------+--------------+
xilinx : virtex5
+---------+-------------------+-----+----------------+---------------------+--------------+
| GENERIC | DEVICE            | RUN | Synthesis Time | Implementation Time | Elapsed Time |
+---------+-------------------+-----+----------------+---------------------+--------------+
| default | xc5vlx20tff323-2* | 1   | 0d 0h:0m:39s   | 0d 0h:1m:50s        | 0d 0h:2m:29s |
+---------+-------------------+-----+----------------+---------------------+--------------+
xilinx : virtex6
+---------+-------------------+-----+----------------+---------------------+--------------+
| GENERIC | DEVICE            | RUN | Synthesis Time | Implementation Time | Elapsed Time |
+---------+-------------------+-----+----------------+---------------------+--------------+
| default | xc6vlx75tff784-3* | 1   | 0d 0h:0m:22s   | 0d 0h:3m:22s        | 0d 0h:3m:44s |
+---------+-------------------+-----+----------------+---------------------+--------------+
Synthesis Time – Time of Synthesis
Implementation Time – Time of Implementation
Elapsed Time – Total Time
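When post-processing report_execution_time.txt, the "Nd Nh:Nm:Ns" strings are easier to compare once converted to seconds. The following is a minimal sketch with a hypothetical helper name; it is not part of ATHENa. It also checks that Elapsed Time equals Synthesis Time plus Implementation Time for the four runs in the report.

```python
# Illustrative sketch: convert the duration strings used in
# report_execution_time.txt (e.g. "0d 0h:1m:34s") to seconds.
import re

_DURATION = re.compile(r"(\d+)d\s+(\d+)h:(\d+)m:(\d+)s")


def to_seconds(text: str) -> int:
    """Parse a 'Dd Hh:Mm:Ss' duration and return the total in seconds."""
    match = _DURATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    days, hours, minutes, seconds = map(int, match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# (synthesis, implementation, elapsed) copied from the report.
RUNS = {
    "spartan3": ("0d 0h:0m:12s", "0d 0h:0m:36s", "0d 0h:0m:48s"),
    "spartan6": ("0d 0h:0m:21s", "0d 0h:1m:13s", "0d 0h:1m:34s"),
    "virtex5": ("0d 0h:0m:39s", "0d 0h:1m:50s", "0d 0h:2m:29s"),
    "virtex6": ("0d 0h:0m:22s", "0d 0h:3m:22s", "0d 0h:3m:44s"),
}

if __name__ == "__main__":
    for family, (syn, imp, total) in RUNS.items():
        consistent = to_seconds(syn) + to_seconds(imp) == to_seconds(total)
        print(f"{family}: {to_seconds(total)} s, consistent={consistent}")
```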
29
design.config.txt: Functional Simulation (1)
# FUNCTIONAL_VERIFICATION_MODE =
FUNCTIONAL_VERIFICATION_MODE =
# directory containing source files of the testbench
VERIFICATION_DIR =
# A file containing a list of testbench files in the order suitable for compilation;
# low level modules first, top level entity last.
# Test vector files should be located in the same directory and listed
# in the same file, unless a fixed path is used. Please refer to the tutorial for more detail.
VERIFICATION_LIST_FILE =
# name of testbench's top level entity
TB_TOP_LEVEL_ENTITY =
# name of testbench's top level architecture
TB_TOP_LEVEL_ARCH =
30
design.config.txt: Functional Simulation (2)
# MAX_TIME_FUNCTIONAL_VERIFICATION =
# supported units are: ps, ns, us, and ms
# if blank, simulation will run until it finishes
# (finishes = no changes in signals, i.e., clock is stopped and no more inputs coming in)
MAX_TIME_FUNCTIONAL_VERIFICATION = <>
# Perform only verification (synthesis and implementation parameters are ignored)
# VERIFICATION_ONLY =
VERIFICATION_ONLY =
31
ATHENa – Database of Results
32
ATHENa Database http://cryptography.gmu.edu/athenadb
33
ATHENa Database – Result View
Algorithm parameters
Design parameters
– Optimization target
– Architecture type
– Datapath width
– I/O bus widths
– Availability of source code
Platform
– Vendor, Family, Device
Timing
– Maximum clock frequency
– Maximum throughput
Resource utilization
– Logic blocks (Slices/LEs/ALUTs)
– Multipliers/DSP units
Tools
– Names & versions
– Detailed options
Credits
– Designers & contact information
34
ATHENa Database – Compare Feature
Matching fields in grey
Non-matching fields in red and blue
35
Possible Future Customizations
The same basic database can be customized and adapted for other domains, such as
– Digital Signal Processing
– Bioinformatics
– Communications
– Scientific Computing, etc.
36
ATHENa - Website
37
ATHENa Website http://cryptography.gmu.edu/athena/
Download of ATHENa Tool
Links to related tools
SHA-3 Competition in FPGAs & ASICs
Specifications of candidates
Interface proposals
RTL source codes
Testbenches
ATHENa database of results
Related papers & presentations
38
GMU Source Codes and Block Diagrams
First batch of GMU Source Codes for all Round 3 SHA-3 Candidates & SHA-2 made available at the ATHENa website at: http://cryptography.gmu.edu/athena
Included in this release:
– Basic architectures
– Folded architectures
– Unrolled architectures
Each code supports two variants: with 256-bit and 512-bit output.
Each source code is accompanied by comprehensive hierarchical block diagrams.
39
ATHENa Result Replication Files
Scripts and configuration files sufficient to easily reproduce all results (without repeating optimizations)
Automatically created by ATHENa for all results generated using ATHENa
Stored in the ATHENa Database
In the same spirit of Reproducible Research as:
– Patrick Vandewalle (EPFL), Jelena Kovacevic (CMU), and Martin Vetterli (EPFL), “Reproducible research in signal processing - what, why, and how,” IEEE Signal Processing Magazine, May 2009. http://rr.epfl.ch/17/
– J. Claerbout (Stanford University), “Electronic documents give reproducible research a new meaning,” in Proc. 62nd Ann. Int. Meeting of the Soc. of Exploration Geophysics, 1992, http://sepwww.stanford.edu/doku.php?id=sep:research:reproducible:seg92.....
40
Benchmarking Goals Facilitated by ATHENa
Comparing multiple:
1. cryptographic algorithms
2. hardware architectures or implementations of the same cryptographic algorithm
3. hardware platforms from the point of view of their suitability for the implementation of a given algorithm (e.g., choice of an FPGA device or FPGA board)
4. tools and languages in terms of quality of results they generate (e.g., Verilog vs. VHDL, Synplicity Synplify Premier vs. Xilinx XST, ISE v. 13.1 vs. ISE v. 12.3)
41
George Mason University Modern FPGA Families
42
ECE 448 – FPGA and ASIC Design with VHDL
Resources
Xcell Journal, available for FREE online @ http://www.xilinx.com/publications/xcellonline/
Electronic Engineering Journal, available for FREE by e-mail after subscribing @ http://www.eejournal.com/subscribe or on the web @ http://www.eejournal.com/design/fpga
43
FPGA Vendors & Families
44
Major FPGA Vendors
SRAM-based FPGAs
– Xilinx, Inc. (~51% of the market)
– Altera Corp. (~34% of the market)
– Lattice Semiconductor
– Atmel
– Achronix
– Tabula
Xilinx and Altera together: ~85% of the market
Flash & antifuse FPGAs
– Actel Corp. (Microsemi SoC Products Group)
– Quick Logic Corp.
45
Xilinx FPGA Devices
Technology | Low-cost   | High-performance
220 nm     | Spartan II | Virtex
120/150 nm |            | Virtex II, II Pro
90 nm      | Spartan 3  | Virtex 4
65 nm      |            | Virtex 5
45 nm      | Spartan 6  |
40 nm      |            | Virtex 6
28 nm      | Artix 7    | Virtex 7
46
Altera FPGA Devices
Technology | Low-cost    | Mid-range | High-performance
130 nm     | Cyclone     |           | Stratix
90 nm      | Cyclone II  |           | Stratix II
65 nm      | Cyclone III | Arria I   | Stratix III
40 nm      | Cyclone IV  | Arria II  | Stratix IV
28 nm      | Cyclone V   | Arria V   | Stratix V
47
LUTs & ALUTs
48
4-input LUTs vs. 6-input LUTs
6-input LUTs introduced in Virtex 5
49
Major Differences between Xilinx Families
                             | Spartan 3, Virtex 4 | Virtex 5, Virtex 6, Spartan 6
Look-Up Tables               | 4-input             | 6-input
Number of CLB slices per CLB | 4                   | 2
Number of LUTs per CLB slice | 2                   | 4
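A k-input LUT is simply a 2^k-entry truth table addressed by its inputs, which is why moving from 4-input to 6-input LUTs raises the storage per LUT from 16 to 64 bits. The sketch below is a behavioral software model for illustration only. The function names are hypothetical and it does not describe any vendor's internal circuit.

```python
# Illustrative behavioral model of a k-input look-up table (LUT):
# the LUT stores one output bit per input combination (2**k bits in total).
from itertools import product


def build_lut(num_inputs, func):
    """Return the truth table (a list of 2**num_inputs bits) implementing func."""
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=num_inputs)]


def lut_read(table, *inputs):
    """Look up the output bit; the first input is the most significant address bit."""
    index = 0
    for bit in inputs:
        index = (index << 1) | (bit & 1)
    return table[index]


if __name__ == "__main__":
    # Any Boolean function of up to 6 variables fits in one 6-input LUT,
    # for example the parity (XOR) of six bits.
    parity6 = build_lut(6, lambda a, b, c, d, e, f: a ^ b ^ c ^ d ^ e ^ f)
    and4 = build_lut(4, lambda a, b, c, d: a & b & c & d)
    print(len(and4), len(parity6))
    print(lut_read(parity6, 1, 0, 1, 1, 0, 0))
```

The same 6-variable function would need several 4-input LUTs and extra routing between them, which is one reason wider LUTs can shorten critical paths.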
50
Xilinx Spartan CLB
The Design Warrior’s Guide to FPGAs Devices, Tools, and Flows. ISBN 0750676043 Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
51
Virtex 5 Arrangement of Slices within the CLB
52
Spartan 3 Multipurpose LUT (MLUT)
53
Virtex 5 64 x 1 Single Port RAM
54
Major Differences between Xilinx Families
                                        | Spartan 3, Virtex 4 | Virtex 5, Virtex 6, Spartan 6
Maximum Single-Port Memory Size per LUT | 16 x 1              | 64 x 1
Maximum Shift Register Size per LUT     | 16 bits             | 32 bits
Number of adder stages per CLB slice    | 2                   | 4
55
Virtex 5 32-bit Shift Register, SRL
56
Altera Cyclone III Logic Element (LE) – Normal Mode
57
High-Level Block Diagram of the Stratix III ALM
58
Altera Stratix III Adaptive Logic Modules (ALM) – Normal Mode
59
FPGA Embedded Resources
60
Embedded Multipliers
62
Multipliers in Spartan 3
63
Number of Multipliers per Spartan 3 Device
64
Combinational and Registered Multiplier
65
Dedicated Multiplier Block
66
Cyclone II
67
Embedded Multiplier Block Overview
Each Cyclone II has one to three columns of embedded multipliers.
Each embedded multiplier can be configured to support
– One 18 x 18 multiplier
– Two 9 x 9 multipliers
68
Multiplier Block Architecture
69
Two Multiplier Types Two 9x9 multiplier 18x18 multiplier
70
Multiplier Stage Signals signa and signb are used to identify the signed and unsigned inputs.
71
3 Ways to Use Dedicated Hardware
Three (3) ways to use dedicated (embedded) hardware
– Inference
– Instantiation
– CoreGen in Xilinx / MegaWizard Plug-In Manager in Altera
72
DSP Units
73
Xilinx XtremeDSP
Starting with the Virtex 4 family, Xilinx introduced the DSP48 block for high-speed DSP on FPGAs
Essentially a multiply-accumulate core with many other features
Now also in Spartan-3A, Spartan 6, Virtex 5, and Virtex 6
74
Multiplier-Accumulator - MAC
75
Mathematical Functions
DSP48 can perform mathematical functions such as:
– Add/Subtract
– Accumulate
– Multiply
– Multiply-Accumulate
– Multiplexer
– Barrel Shifter
– Counter
– Divide (multi-cycle)
– Square Root (multi-cycle)
Can also create filters such as:
– Serial FIR Filter (Xilinx calls these MACC filters)
– Parallel FIR Filter
– Semi-Parallel FIR Filter
– Multi-rate FIR Filters
76
DSP48 Slice: Virtex 4
77
Simplified Form of DSP48
Adder Out = (Z ± (X + Y + CIN))
78
Choosing Inputs to DSP Adder
P = Adder Out = (Z ± (X + Y + CIN))
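The adder equation P = Z ± (X + Y + CIN) can be mimicked in software to see how accumulation arises when Z is fed back from the previous result. The sketch below is a simplified behavioral model. It assumes a 48-bit result that wraps around (two's complement), does not model the X/Y/Z multiplexers or pipeline registers, and routes the whole product through X with Y = 0 for simplicity. All function names are hypothetical.

```python
# Illustrative behavioral model of the DSP48 adder stage:
#   P = Z +/- (X + Y + CIN), truncated to 48 bits.
# Assumptions: 48-bit wraparound, no multiplexer or register modeling.

P_WIDTH = 48
P_MASK = (1 << P_WIDTH) - 1


def dsp48_adder(z, x, y, cin=0, subtract=False):
    """Return Z + (X + Y + CIN) or Z - (X + Y + CIN) as a 48-bit pattern."""
    inner = x + y + cin
    total = z - inner if subtract else z + inner
    return total & P_MASK


def to_signed(value):
    """Interpret a 48-bit pattern as a two's complement integer."""
    if value & (1 << (P_WIDTH - 1)):
        return value - (1 << P_WIDTH)
    return value


def multiply_accumulate(samples, coeffs):
    """Accumulate sum(a*b) by feeding the previous result back in as Z."""
    p = 0
    for a, b in zip(samples, coeffs):
        product = (a * b) & P_MASK  # simplification: whole product via X
        p = dsp48_adder(z=p, x=product, y=0)
    return to_signed(p)


if __name__ == "__main__":
    print(dsp48_adder(z=10, x=3, y=4, cin=1))
    print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

Feeding P back as Z is what turns the adder into an accumulator, which is the basis of the serial (MACC) FIR filters mentioned earlier.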
79
DSP48E Slice: Virtex 5
80
New in Virtex 5 Compared to Virtex 4
81
Xilinx DSP48
82
Stratix III DSP Unit
83
Embedded Memories
84
Memory Types
[Figure: classification of memory. Memory is either RAM or ROM; single port or dual port; with asynchronous read or with synchronous read.]
85
Memory Types in Xilinx
[Figure: classification of memory in Xilinx devices. By implementation: distributed (MLUT-based) or block RAM-based (BRAM-based). By design entry: inferred or instantiated; instantiated either manually or using Core Generator.]
86
Memory Types in Altera
[Figure: classification of memory in Altera devices. By implementation: distributed (ALUT-based, Stratix III onwards) or memory block-based, the latter in small (512), medium (4K, 9K, 20K), and large (144K, 512K) sizes. By design entry: inferred or instantiated; instantiated either manually or using MegaWizard Plug-In Manager.]
87
The embedded memory structure consists of columns of M4K memory blocks that can be configured as RAM, first-in first-out (FIFO) buffers, and ROM Cyclone II Memory Blocks
88
Single-Port ROM
The address lines of the ROM are registered
The outputs can be registered or unregistered
A .mif file is used to initialize the ROM contents
89
Stratix II TriMatrix Memory
91
Stratix III & Stratix IV TriMatrix Memory
92
Stratix II & III Shift-Register Memory Configuration
93
Supply Voltage
94
Change in Supply Voltages
Year | Technology (nm) | Core Supply Voltage (V)
1998 | 350             | 3.3
1999 | 250             | 2.5
2000 | 180             | 1.8
2001 | 150             | 1.5
2003 | 130             | 1.2
2008 | 65              | 1.0
2009 | 40              | 0.9
2011 | 28              | 0.9
95
Gigabit Transceivers
96
Using a Bus to Communicate Between Devices
97
Using High-Speed Transceivers to Communicate Between Devices
98
Using High-Speed Transceivers to Communicate Between Devices
99
Effect of Noise on Single Wire and Differential Pair
100
Generating a Differential Pair
101
Multiple Standards for High-Speed Serial Communication
– Fibre Channel
– InfiniBand
– PCI Express (developed by Intel)
– RapidIO
– SkyRail (developed by MindSpeed Technologies)
– 10-gigabit Ethernet
102
Using FPGA to Interface Between Multiple Standards
103
Ganging Multiple Transceivers Together
104
An Ideal Signal vs. Signal Seen by Receiver
105
The Effects of Transmitting a Series of Identical Bits
106
Main Elements of the Transceiver Block
107
Recovering Clock Signal
108
Sampling the Incoming Signal
109
Xilinx ML605 Evaluation Kit $1795
110
PLDA XpressV6 Design Kit $3990
111
HiTech Global PCI Express Gen 2 / SFP+ / USB 3.0 Development Board $2995
112
HiTech Global HXT 8-lane PCI Express/4-port SFP+ Optical Network Card $8995
113
HiTech Global HXT 16-lane PCI Express Optical Network Card $8995
114
Altera Stratix IV GX FPGA Development Kit $4495
115
PLDA XpressGX4LP Design Kit $4990
116
HiTech Global GT/GX PCI Express Gen 2 / 3 & Optical Development Platform/Networking Card $5995
117
Terasic DE4 Development and Education Board $2995
118
Gutz Logic PCI Express x1 Demo Board (Actel FPGA)
119
LatticeSC PCI Express x4 Evaluation
120
Board Overview
Manufacturer  | Name                    | FPGA         | Memory     | Application     | PCIe            | Throughput | Base Price
Boards based on Xilinx Virtex-6
Xilinx        | ML605 Evaluation Kit    | LX240T-1     | 2GB (max)  | General Purpose | 1.1 x8 / 2.0 x4 | 2 GB/s     | $1795
PLDA          | XpressV6 Design Kit     | LX550T (max) | 8GB (max)  | General Purpose | 2.0 x8          | 4 GB/s     | $3990
HiTech Global | PCI Express / USB 3.0   | LX550T (max) | 8GB (max)  | General Purpose | 2.0 x8          | 4 GB/s     | $2995
HiTech Global | HXT 8-lane Optical      | HX565T (max) | 16GB (max) | High Speed Eth. | 2.0 x8          | 4 GB/s     | $8995
HiTech Global | HXT 16-lane Optical     | HX565T (max) | 16GB (max) | High Speed Eth. | 2.0 x16         | 8 GB/s     | $8995
Boards based on Altera Stratix IV
Altera        | Stratix IV GX Kit       | GX530 (max)  | 512MB      | General Purpose | 2.0 x8          | 4 GB/s     | $4495
PLDA          | XpressGX4LP             | GX530 (max)  | 2GB (max)  | High Speed Eth. | 2.0 x8          | 4 GB/s     | $4990
HiTech Global | GT/GX PCIe & Optical    | 100G5 (max)  | 4GB (max)  | High Speed Eth. | 3.0 x8          | 8 GB/s     | $5995
Terasic       | DE4 Board               | GX530 (max)  | 8GB (max)  | General Purpose | 2.0 x8          | 4 GB/s     | $2995
Boards based on Actel ProASIC3
Gutz Logic    | PCIe x1 Demo Board      | A3P1000      | 1MB        | PCIe Evaluation | 1.1 x1          | 250 MB/s   | N/A
Boards based on Lattice Semiconductor LatticeSC
Lattice       | LatticeSC PCIe x4 Board | ECP2M-50     | 32MB       | PCIe Evaluation | 1.1 x4          | 1 GB/s     | N/A
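The Throughput column follows from the PCIe generation and lane count. The sketch below shows the arithmetic and is not taken from the slides. It uses the standard per-lane signaling rates and line encodings (2.5 GT/s and 5 GT/s with 8b/10b for generations 1.x and 2.0, 8 GT/s with 128b/130b for 3.0) and computes raw per-direction bandwidth, ignoring packet and protocol overhead. The function name is hypothetical.

```python
# Illustrative sketch: raw per-direction PCIe bandwidth from generation and
# lane count, ignoring packet/protocol overhead.

# generation -> (transfer rate in GT/s per lane, payload bits, encoded bits)
PCIE_GENERATIONS = {
    "1.1": (2.5, 8, 10),     # 8b/10b encoding
    "2.0": (5.0, 8, 10),     # 8b/10b encoding
    "3.0": (8.0, 128, 130),  # 128b/130b encoding
}


def pcie_throughput_mb_per_s(generation: str, lanes: int) -> float:
    """Return raw bandwidth in MB/s (1 MB = 10**6 bytes) for one direction."""
    rate_gt, payload_bits, encoded_bits = PCIE_GENERATIONS[generation]
    bits_per_s_per_lane = rate_gt * 1e9 * payload_bits / encoded_bits
    return bits_per_s_per_lane / 8 / 1e6 * lanes


if __name__ == "__main__":
    for gen, lanes in [("1.1", 1), ("1.1", 8), ("2.0", 4), ("2.0", 8),
                       ("2.0", 16), ("3.0", 8)]:
        mb = pcie_throughput_mb_per_s(gen, lanes)
        print(f"PCIe {gen} x{lanes}: {mb:.0f} MB/s")
```

The 3.0 x8 entry comes out slightly below 8 GB/s because of the 128b/130b encoding, so the table value for that board is a rounded figure.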
121
121ECE 448 – FPGA and ASIC Design with VHDL Embedded Microprocessors
122
122ECE 448 – FPGA and ASIC Design with VHDL The Design Warrior’s Guide to FPGAs Devices, Tools, and Flows. ISBN 0750676043 Copyright © 2004 Mentor Graphics Corp. (www.mentor.com) Embedded Microprocessor Cores
123
123ECE 448 – FPGA and ASIC Design with VHDL Virtex-II Pro Architecture24 6 1 5 3 Features: 1.Processor Block 2.RocketIO Multi-Gigabit Transceivers 3.CLB and Configurable Logic 4.SelectIO-Ultra 5.Digital Clock Managers 6.Multipliers and Block SelectRAM
124
125
125 PowerPC Cores, PowerPC System
126
126 Embedded Development Kit (EDK)
[Figure: EDK design flow.
Hardware flow: Microprocessor Hardware Specification File; Processor IP, Microprocessor Peripheral Description Files; PlatGen; VHDL / Verilog; Synthesizer; EDIF IP Netlists; System Constraint File; ISE / Xflow; Bitstream.
Software flow: Microprocessor Software Specification File; LibGen; Libraries; C / C++ Code; Compiler; Object Files; Linker; Executable.
Final steps: Data2MEM; Download to FPGA.]
127
127 Zynq - Extensible Processing Platform
128
128 Zynq-7000 EPP
129
129 Zynq-7000 Product Table
130
George Mason University Follow-up Courses
131
ECE Department
Programs:
- MS in Electrical Engineering (MS EE)
- MS in Computer Engineering (MS CpE)
Specializations:
- COMMUNICATIONS & NETWORKING
- SIGNAL PROCESSING
- CONTROL & ROBOTICS
- MICROELECTRONICS/NANOELECTRONICS
- SYSTEM DESIGN
- DIGITAL SYSTEMS DESIGN
- COMPUTER NETWORKS
- MICROPROCESSORS & EMBEDDED SYSTEMS
- NETWORK & SYSTEM SECURITY
- BIOENGINEERING
132
DIGITAL SYSTEMS DESIGN
1. ECE 545 Digital System Design with VHDL (Fall) – K. Gaj, project, FPGA design with VHDL, Aldec/Synplicity/Xilinx/Altera
2. ECE 645 Computer Arithmetic (Spring) – K. Gaj, project, FPGA design with VHDL or Verilog, Aldec/Synplicity/Xilinx/Altera
3. ECE 586 Digital Integrated Circuits (Spring) – D. Ioannou
4. ECE 681 VLSI Design for ASICs (Fall) – H. Homayoun, project/lab, front-end and back-end ASIC design with Synopsys tools
5. ECE 682 VLSI Test Concepts (Spring) – T. Storey, homework
6. ECE 699 Digital Signal Processing Hardware Architectures (Fall) – A. Cohen, project, FPGA design with VHDL or Verilog
133
Possible New Graduate Computer Engineering Courses
- 5xx Digital System Design with Verilog
- 6xx Reconfigurable Computing
(looking for instructors)
134
NETWORK AND SYSTEM SECURITY
1. ECE 542 Computer Network Architectures and Protocols (Fall, Spring) – S.-C. Chang, et al.
2. ECE 646 Cryptography and Computer Network Security (Fall) – K. Gaj, J-P. Kaps – lab, project: software/hardware/analytical
3. ECE 746 Advanced Applied Cryptography (every 2nd Spring, 2013) – K. Gaj, J-P. Kaps – lab, project: software/hardware/analytical
4. ECE 699 Cryptographic Engineering (every 2nd Spring, 2014) – J-P. Kaps – lectures + student/invited guests seminars
5. ISA 656 Network Security (Fall, Spring) – A. Stavrou
135
ECE 645 Computer Arithmetic Instructor: Dr. Kris Gaj
136
Advanced digital circuit design course covering
- addition and subtraction
- multiplication
- division and modular reduction
- exponentiation
Efficient architectures for
- Integers: unsigned and signed
- Real numbers: fixed point; single and double precision floating point
- Elements of the Galois field GF(2^n): polynomial basis
137
Course Objectives
At the end of this course you should be able to:
- Understand mathematical and gate-level algorithms for computer addition, subtraction, multiplication, division, and exponentiation
- Understand the tradeoffs among performance, area, latency, scalability, etc. for different arithmetic architectures
- Synthesize and implement computer arithmetic blocks on FPGAs
- Be comfortable with different number systems, and have familiarity with floating-point and Galois field arithmetic for future study
- Understand sources of error in computer arithmetic and the basics of error analysis
This knowledge will come about through homework, projects, and practice exams.
138
Lecture topics
INTRODUCTION
1. Applications of computer arithmetic algorithms. Initial discussion of project topics.
139
ADDITION AND SUBTRACTION
1. Basic addition, subtraction, and counting
2. Addition in Xilinx and Altera FPGAs
3. Carry-lookahead, carry-select, and hybrid adders
4. Adders based on Parallel Prefix Networks
5. Pipelined Adders
6. Modular addition and subtraction
140
MULTIOPERAND ADDITION
1. Carry-save adders
2. Wallace and Dadda Trees
3. Adding multiple unsigned and signed numbers
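As a rough illustration of the carry-save idea listed above, the Python sketch below models a 3:2 compressor on non-negative integers and uses it to reduce several operands to two words before one final carry-propagate addition. It is a behavioral model written for this transcript, not material from the course; the function names are hypothetical, and the simple linear reduction order is an assumption (it is not a Wallace or Dadda tree).

```python
from typing import List, Tuple


def carry_save_add(x: int, y: int, z: int) -> Tuple[int, int]:
    """3:2 compressor: return (sum_word, carry_word) with x + y + z == sum_word + carry_word."""
    sum_word = x ^ y ^ z                            # bitwise sum without carries
    carry_word = ((x & y) | (x & z) | (y & z)) << 1  # majority bits, shifted to the next weight
    return sum_word, carry_word


def multi_operand_add(operands: List[int]) -> int:
    """Add non-negative integers using carry-save reduction and one final addition."""
    ops = list(operands)
    while len(ops) > 2:
        s, c = carry_save_add(ops[0], ops[1], ops[2])
        ops = ops[3:] + [s, c]                       # three operands replaced by two
    return sum(ops)                                  # single carry-propagate addition


if __name__ == "__main__":
    print(multi_operand_add([5, 9, 12, 7, 3]))
```

In hardware the point of this structure is that no carry ripples inside the compressors; only the last addition needs a carry-propagate adder.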
141
NUMBER REPRESENTATIONS
- Unsigned Integers
- Signed Integers
- Fixed-point real numbers
- Floating-point real numbers
- Elements of the Galois Field GF(2^n)
142
LONG INTEGER ARITHMETIC
1. Modular Exponentiation
2. Montgomery Multipliers and Exponentiation Units
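To make the two topics above concrete, here is a minimal Python sketch of bit-serial (radix-2) Montgomery multiplication and a left-to-right square-and-multiply exponentiation built on it. It is an illustrative software model under stated assumptions (odd modulus smaller than 2^k, operands already reduced); the function names are hypothetical and it does not reproduce any specific architecture from the course.

```python
def montgomery_multiply(a: int, b: int, modulus: int, k: int) -> int:
    """Return a * b * 2**(-k) mod modulus.

    Assumes modulus is odd, modulus < 2**k, and 0 <= a, b < modulus.
    """
    acc = 0
    for i in range(k):
        acc += ((a >> i) & 1) * b   # add the partial product for bit i of a
        if acc & 1:                 # make acc even by adding the (odd) modulus
            acc += modulus
        acc >>= 1                   # exact division by 2
    return acc - modulus if acc >= modulus else acc


def montgomery_modexp(base: int, exponent: int, modulus: int, k: int) -> int:
    """Return base**exponent mod modulus using Montgomery multiplications only."""
    r2 = pow(2, 2 * k, modulus)                               # R^2 mod M, with R = 2**k
    x = montgomery_multiply(base % modulus, r2, modulus, k)   # base in Montgomery form
    result = montgomery_multiply(1, r2, modulus, k)           # 1 in Montgomery form
    for bit in bin(exponent)[2:]:                             # most significant bit first
        result = montgomery_multiply(result, result, modulus, k)
        if bit == "1":
            result = montgomery_multiply(result, x, modulus, k)
    return montgomery_multiply(result, 1, modulus, k)         # convert back


if __name__ == "__main__":
    print(montgomery_modexp(7, 560, 561, 10) == pow(7, 560, 561))
```

The loop body uses only additions and shifts, which is why this formulation maps well onto hardware multipliers for long integers.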
143
MULTIPLICATION
1. Tree and array multipliers
2. Sequential multipliers
3. Multiplication of signed numbers and squaring
4. Multiplication in Xilinx and Altera FPGAs
   - using distributed logic
   - using embedded multipliers
   - using DSP blocks
5. Multiple clock systems
144
DIVISION
1. Basic restoring and non-restoring sequential dividers
2. SRT and high-radix dividers
3. Array dividers
4. Division by Convergence
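The first item above, the restoring sequential divider, can be sketched in a few lines of Python. This is a behavioral model assuming unsigned operands and a dividend that fits in n bits; the function name and interface are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def restoring_divide(dividend: int, divisor: int, n: int) -> Tuple[int, int]:
    """n-bit unsigned restoring division; return (quotient, remainder).

    Assumes 0 <= dividend < 2**n and divisor > 0.
    """
    if divisor <= 0:
        raise ZeroDivisionError("divisor must be positive")
    remainder = 0
    quotient = 0
    for i in range(n - 1, -1, -1):
        # Shift the partial remainder left and bring in the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        remainder -= divisor                 # trial subtraction
        if remainder < 0:
            remainder += divisor             # restore; quotient bit is 0
            quotient <<= 1
        else:
            quotient = (quotient << 1) | 1   # keep the difference; quotient bit is 1
    return quotient, remainder


if __name__ == "__main__":
    print(restoring_divide(100, 7, 8))
```

One quotient bit is produced per iteration, which corresponds to one clock cycle per bit in a basic sequential hardware divider.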
145
FLOATING POINT AND GALOIS FIELD ARITHMETIC
1. Floating-point units
2. Galois Field GF(2^n) units
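For the Galois field item above, a minimal Python sketch of shift-and-add multiplication in GF(2^n) with a polynomial basis is given below. Field elements are modeled as integers whose bits are polynomial coefficients. The function name is hypothetical, and the example field (n = 8 with the AES reduction polynomial x^8 + x^4 + x^3 + x + 1, written 0x11B) is chosen here only as a familiar test case; it is not taken from the slides.

```python
def gf2n_multiply(a: int, b: int, reduction_poly: int, n: int) -> int:
    """Multiply a and b in GF(2^n), polynomial basis.

    reduction_poly is the irreducible polynomial including its x^n term.
    Assumes 0 <= a, b < 2**n.
    """
    result = 0
    for _ in range(n):
        if b & 1:
            result ^= a               # addition in GF(2^n) is bitwise XOR
        b >>= 1
        a <<= 1                       # multiply a by x
        if (a >> n) & 1:
            a ^= reduction_poly       # reduce modulo the field polynomial
    return result


if __name__ == "__main__":
    print(hex(gf2n_multiply(0x57, 0x83, 0x11B, 8)))
```

Because addition is carry-free XOR, hardware units for this arithmetic avoid carry chains entirely, in contrast to the integer adders and multipliers earlier in the topic list.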
146
ECE 682 VLSI Test Concepts Instructor: Dr. Tom Storey
147
Course Description
Broad introduction to basic concepts, techniques, and tools of modern VLSI testing. Fundamentals of defect modeling, fault simulation, design for testability, built-in self-test techniques, and failure analysis. Topics include test economics, physical defects and fault modeling, automated test pattern generation, fault simulation, design for test, built-in self-test, memory test, PLD test, mixed-signal test, Iddq test, boundary scan and related standards, test synthesis, diagnosis and failure analysis, automated test equipment, and embedded core test.
148
Course Text
149
Course Topics
- Introduction to Test Methods, Test Equipment, and the Economics of Test
- Fault and Defect Modeling
- Logic Test Generation
- Fault Simulation
- Memory Test
- Design for Testability
- Advanced Testing Methods
- Future of VLSI Test
150
Course Changes
New Text
– Updated to reflect advances in state of the art
– Covers a broader range of test topics
– More engaging text, figures
Course Content
– Redone to reflect textbook change
– Added developments since text was written
– More emphasis on industry examples/war stories