Commercial FPGAs: Altera Stratix Family Dr. Philip Brisk Department of Computer Science and Engineering University of California, Riverside CS 223.

1 Commercial FPGAs: Altera Stratix Family Dr. Philip Brisk Department of Computer Science and Engineering University of California, Riverside CS 223

2 Notes on These Slides
– Altera has disclosed the details of their devices both in online documentation and in academic papers.
– The academic papers evaluate different design decisions and tradeoffs; the experiments are a bit too specialized for this course.
  – Please do not overly emphasize the experimentation in your studies.

3 The Stratix™ Routing and Logic Architecture
D.M. Lewis, et al., International Symposium on FPGAs, 2003
Online documentation

4 Altera Stratix FPGA

5 Stratix Logic Element (LE)

6 Register Feedback Mode

7 Register Cascade (Shift Regs.)

8 Logic Array Block (LAB)

9 Directionally Biased Routing
– Long vertical wires require power drivers
  – Fewer vertical wires
– More rows than columns
  – More demand for horizontal wires

10 The Stratix II Logic and Routing Architecture
D.M. Lewis, et al., International Symposium on FPGAs, 2005
Online documentation

11

12 Logic Array Block (LAB)

13 Adaptive Logic Module (ALM)

14

15 Four ALM Operating Modes
– Normal Mode
– Extended LUT Mode
– Arithmetic Mode
– Shared Arithmetic Mode

16 Normal Mode

17 LUT Input Utilization

18 Extended LUT Mode
– Some 7-input logic functions

19 Arithmetic Mode

20 Arithmetic Mode Example
R = (X < Y) ? Y : X
– (X < Y)
  – Compute X-Y using the carry chain
  – Only look at the carry output
– Use the carry output to select either X or Y accordingly
– Configure the LUTs to pass X through unmodified, and ignore the carry chain outputs
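The compare-and-select idea on this slide can be sketched in software. The following is a minimal bit-level model, not the ALM's actual circuit: it assumes unsigned operands, computes X - Y as X + (~Y) + 1 through a ripple carry, and keeps only the final carry. The function names `subtract_carry_out` and `max_via_carry` and the 8-bit default width are hypothetical choices for illustration.

```python
def subtract_carry_out(x: int, y: int, width: int) -> int:
    """Ripple x + (~y) + 1 through a width-bit carry chain; return the last carry."""
    carry = 1  # the "+1" of two's-complement negation enters as carry-in
    for i in range(width):
        a = (x >> i) & 1
        b = ((y >> i) & 1) ^ 1  # inverted bit of y
        # Carry out of a full adder is the majority of its three inputs.
        carry = (a & b) | (a & carry) | (b & carry)
    return carry  # 1 means x >= y (no borrow), 0 means x < y


def max_via_carry(x: int, y: int, width: int = 8) -> int:
    """R = (X < Y) ? Y : X, using only the carry output of X - Y."""
    carry_out = subtract_carry_out(x, y, width)
    return x if carry_out else y
```

The sum bits of the subtraction are never used; only the carry output drives the selection, which matches the slide's "only look at the carry output".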

21 Shared Arithmetic Mode (3-input Add)
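A 3-input add is commonly built by first compressing the three operands into a sum word and a carry word, then adding those two words on an ordinary carry chain. The sketch below models that arithmetic identity only; how the ALM's LUTs and adders are actually wired is shown in the slide's figure and is not reproduced here. The names `compress3` and `three_input_add` are hypothetical.

```python
def compress3(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 compression: per-bit sum (XOR) and per-bit carry (majority)."""
    sum_bits = x ^ y ^ z
    carry_bits = (x & y) | (x & z) | (y & z)
    return sum_bits, carry_bits


def three_input_add(x: int, y: int, z: int) -> int:
    """Add three non-negative integers with one compression step and one add."""
    sum_bits, carry_bits = compress3(x, y, z)
    # Each carry bit has twice the weight of the position that produced it.
    return sum_bits + (carry_bits << 1)
```

For example, `three_input_add(5, 6, 7)` compresses to a sum word of 4 and a carry word of 7, then adds 4 + 14.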

22 Register Chain (Shift Registers)
– Separates logic and shift register functions
– Cycle 1: Combinational logic
– Cycles 2..k+1: Shift by k
…
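The timing on this slide (a value produced in one cycle, then shifted through k registers) can be illustrated with a small cycle-by-cycle model. This is a behavioral sketch only; `shift_chain` is a hypothetical name, k is assumed to be at least 1, and `None` stands in for the unknown initial register contents.

```python
from collections import deque


def shift_chain(samples, k):
    """Feed one sample per cycle into a chain of k registers (k >= 1).

    Returns the value seen at the end of the chain in each cycle.
    """
    regs = deque([None] * k, maxlen=k)  # regs[0] is the first register
    outputs = []
    for sample in samples:
        outputs.append(regs[-1])  # value held by the last register this cycle
        regs.appendleft(sample)   # clock edge: every register shifts by one
    return outputs
```

With `k = 2`, a sample presented in one cycle appears at the chain output two cycles later.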

23 ALM Benefits
– Reduced LAB area by 2.6% compared to Stratix
– 15% performance improvement
– When shrinking from a 0.13 µm (Stratix) to a 90 nm (Stratix II) technology node
  – 51% performance improvement
  – 50% area decrease

24 TriMatrix Embedded Memories

25 M512 RAM Block
– Functions: 1-port RAM, 2-port RAM, FIFO, ROM, Shift Register
– 576 RAM bits (32 x 18), includes parity bits

26 M4K RAM Block
– 4,608 RAM bits (128 x 36), includes parity bits
– Functions: 1-port RAM, 2-port RAM, True 2-port RAM, FIFO, ROM, Shift Register

27 M-RAM Block
– 589,824 RAM bits (4K x 144), includes parity bits
– Functions: 1-port RAM, 2-port RAM, True 2-port RAM, FIFO
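The bit counts quoted for the three TriMatrix block sizes follow directly from their widest geometries. The short check below reproduces them; the data-bit figure additionally assumes one parity bit for every eight data bits, which is an assumption of this sketch rather than something stated on the slides. The names `BLOCKS`, `total_bits`, and `data_bits` are hypothetical.

```python
# (words, bits per word) as listed on the slides; 4K is taken as 4096.
BLOCKS = {
    "M512": (32, 18),
    "M4K": (128, 36),
    "M-RAM": (4096, 144),
}


def total_bits(words: int, width: int) -> int:
    """All RAM bits in the block, parity included."""
    return words * width


def data_bits(words: int, width: int) -> int:
    """Bits left for data, assuming 1 parity bit per 9 stored bits."""
    return words * width * 8 // 9


for name, (words, width) in BLOCKS.items():
    print(name, total_bits(words, width), data_bits(words, width))
```

Under that parity assumption the data capacities come out to 512, 4,096, and 524,288 bits, which lines up with the M512 and M4K block names.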

28 MRAM LAB Interface

29 DSP Blocks
– Eight 9x9 multipliers
– Four 18x18 multipliers
– One 36x36 multiplier
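The pairing of "four 18x18" with "one 36x36" is consistent with the standard way of building a wide multiplier from narrower ones: split each operand into halves, form four partial products, and sum them with the right shifts. The sketch below shows that decomposition for unsigned operands; it illustrates the arithmetic only and is not a description of the block's internal datapath. The name `mul36_from_18` is hypothetical.

```python
def mul36_from_18(a: int, b: int) -> int:
    """Unsigned 36x36 multiply assembled from four 18x18 partial products."""
    mask18 = (1 << 18) - 1
    a_lo, a_hi = a & mask18, (a >> 18) & mask18
    b_lo, b_hi = b & mask18, (b >> 18) & mask18

    # Four 18x18 products, one per pairing of operand halves.
    lo_lo = a_lo * b_lo
    lo_hi = a_lo * b_hi
    hi_lo = a_hi * b_lo
    hi_hi = a_hi * b_hi

    # Weight each partial product by the position of its halves.
    return (hi_hi << 36) + ((lo_hi + hi_lo) << 18) + lo_lo
```

Signed operands would need sign handling on the upper halves, which this sketch leaves out.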

30 DSP Block Internals
– Add/Sub/Accum
– Functions: Multiplier, Multiply-Accum, AB + CD, AB + CD + EF + GH

31 DSP Block Interconnect Interface

32 Architectural Enhancements in Stratix-III™ and Stratix-IV™
D.M. Lewis, et al., International Symposium on FPGAs, 2009
Online documentation (Stratix III)
Online documentation (Stratix IV)

33 New Features
– Programmable power management
– LUT-RAM
– LUT-Register Mode
– Enhanced DSP Block

34 Programmable Body Bias Control
– Large regions
  – Less body bias control circuitry
– Small regions
  – Fine-grained power management

35 Power Efficiency

36 LUT-RAM
(Figure labels: SRAM, x, y)
– Idea
  – Use the SRAM bits as memory
  – Granularity is LAB-wide
– What is needed?
  – Write capability
  – Signals for address and data for the write path

37 LUT-RAM Architecture
– Supports one read + one write in a single cycle
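The LUT-RAM idea can be modeled as a truth table whose configuration bits double as a small memory once a write path exists. The class below is a behavioral sketch under stated assumptions: the class name `LutRam`, its method names, and the choice that a read in the same cycle as a write to the same address returns the old value are all illustrative, not taken from the device documentation.

```python
class LutRam:
    """A k-input LUT whose SRAM configuration bits can also be written at run time."""

    def __init__(self, addr_bits: int = 4, init: int = 0):
        # Bit i of `init` is the LUT output for input pattern i.
        self.bits = [(init >> i) & 1 for i in range(1 << addr_bits)]

    def lookup(self, addr: int) -> int:
        """Plain LUT behavior: the inputs select one SRAM bit."""
        return self.bits[addr]

    def cycle(self, read_addr: int, write_addr=None, write_bit=None) -> int:
        """One clock cycle with one read and, optionally, one write."""
        read_value = self.bits[read_addr]  # assumed: read sees the old contents
        if write_addr is not None:
            self.bits[write_addr] = write_bit & 1
        return read_value


# A 2-input LUT configured as AND (only pattern 0b11 produces 1).
lut = LutRam(addr_bits=2, init=0b1000)
```

The separate `write_addr` and `write_bit` arguments correspond to the extra address and data signals that the write path needs.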

38 MLAB vs. LAB

39 ALM LUT-Register Mode
https://upload.wikimedia.org/wikipedia/commons/c/c6/R-S_mk2.gif

40 ALM LUT-Register Mode

41 DSP Block Capabilities
– High-performance, power-optimized, fully registered and pipelined multiplication operations
– Natively supported 9-bit, 12-bit, 18-bit, and 36-bit wordlengths
– Natively supported 18-bit complex multiplications
– Efficiently supported floating-point arithmetic formats (24-bit for single precision and 53-bit for double precision)
– Signed and unsigned input support
– Built-in addition, subtraction, and accumulation units to combine multiplication results efficiently
– Cascading 18-bit input bus to form tap-delay line for filtering applications
– Cascading 44-bit output bus to propagate output results from one block to the next block without external logic support
– Rich and flexible arithmetic rounding and saturation units
– Efficient barrel shifter support
– Loopback capability to support adaptive filtering

42 DSP Block Overview

43 Multiply-Add

44 4-Multiply Add w/Accumulation

45 Cascading Output for FIR Filters
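An FIR filter maps naturally onto a chain of multiply-add stages: the input samples move along a tap-delay line while each stage adds its product to the running sum handed on by the previous stage. The sketch below models that behavior in software. It ignores bus widths, pipelining, rounding, and saturation, and the name `fir_cascade` is hypothetical.

```python
def fir_cascade(samples, coeffs):
    """Direct-form FIR: y[n] = sum over i of coeffs[i] * x[n - i]."""
    taps = [0] * len(coeffs)  # tap-delay line, newest sample first
    outputs = []
    for x in samples:
        taps = [x] + taps[:-1]  # shift the delay line by one sample
        cascade_sum = 0
        for coeff, tap in zip(coeffs, taps):
            # Each stage adds its product to the sum cascaded from the previous stage.
            cascade_sum = cascade_sum + coeff * tap
        outputs.append(cascade_sum)
    return outputs
```

Feeding a unit impulse through the filter returns the coefficients in order, which is a quick sanity check for the tap ordering.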

46 Full DSP Block

47 Half-DSP Block Architecture

48 Four 9-bit Independent Half-DSP Multiplier Mode

49 Three 12-bit Independent Half-DSP Multiplier Mode

50 Two 18-bit Independent Half-DSP Multiplier Mode

51 36-bit Half-DSP Multiplier Mode

52 54x54-bit Multiplier Mode
– Used for double-precision floating-point

53 Architectural Enhancements in Stratix-V™
D.M. Lewis, et al., International Symposium on FPGAs, 2013
Online documentation

54 Larger MLAB/LUT-RAM

55 4 Flip-Flops per ALM

56 Embedded Memories with Error Correction Codes (ECC)

