Mohamed ABDELFATTAH, Vaughn BETZ. Why NoCs on FPGAs? Embedded NoCs. Power Analysis.

Presentation transcript:

1 Mohamed ABDELFATTAH Vaughn BETZ

2 Outline: 1. Why NoCs on FPGAs? 2. Embedded NoCs 3. Power Analysis

3 1. Why NoCs on FPGAs? Labels: Interconnect, Logic Blocks, Switch Blocks, Wires.

4 1. Why NoCs on FPGAs? Logic Blocks, Switch Blocks, Wires. Hard Blocks: Memory, Multiplier, Processor.

5 1. Why NoCs on FPGAs? Logic Blocks, Switch Blocks, Wires. Hard Blocks: Memory, Multiplier, Processor. Hard Interfaces: DDR/PCIe.. Interconnect still the same. Clock labels: 1600 MHz, 200 MHz, 800 MHz.

6 1. Why NoCs on FPGAs? Hard interfaces shown: DDR3 PHY and Controller, PCIe Controller, Gigabit Ethernet. Clock labels: 1600 MHz, 200 MHz, 800 MHz. Problems: 1. Bandwidth requirements for hard logic/interfaces. 2. Timing closure.

7 1. Why NoCs on FPGAs? Hard interfaces shown: DDR3 PHY and Controller, PCIe Controller, Gigabit Ethernet. Problems: 1. Bandwidth requirements for hard logic/interfaces. 2. Timing closure. 3. High interconnect utilization: huge CAD problem, slow compilation, power/area utilization. 4. Wire speed not scaling: delay is interconnect-dominated.

8 Barcelona, Los Angeles. Keep the roads, but add freeways. Labels: Hard Blocks, Logic Cluster. Source: Google Earth

9 1. Why NoCs on FPGAs? Hard interfaces shown: DDR3 PHY and Controller, PCIe Controller, Gigabit Ethernet. Problems: 1. Bandwidth requirements for hard logic/interfaces. 2. Timing closure. 3. High interconnect utilization: huge CAD problem, slow compilation, power/area utilization. 4. Wire speed not scaling: delay is interconnect-dominated. NoC: Routers, Links. Router forwards data packet. Router moves data to local interconnect.

10 1. Why NoCs on FPGAs? Hard interfaces shown: DDR3 PHY and Controller, PCIe Controller, Gigabit Ethernet. Problems: 1. Bandwidth requirements for hard logic/interfaces. 2. Timing closure. 3. High interconnect utilization: huge CAD problem, slow compilation, power/area utilization. 4. Wire speed not scaling: delay is interconnect-dominated. 5. Abstraction favors modularity: parallel compilation, partial reconfiguration, multi-chip interconnect. NoC: Pre-design NoC to requirements. High bandwidth endpoints known. NoC links are re-usable. NoC is heavily pipelined. NoC abstraction favors modularity.

11 1. Why NoCs on FPGAs? Hard interfaces shown: DDR3 PHY and Controller, PCIe Controller, Gigabit Ethernet. Problems: 1. Bandwidth requirements for hard logic/interfaces. 2. Timing closure. 3. High interconnect utilization: huge CAD problem, slow compilation, power/area utilization. 4. Wire speed not scaling: delay is interconnect-dominated. 5. Abstraction favors modularity: parallel compilation, partial reconfiguration, multi-chip interconnect. NoC: Latency-tolerant communication. NoC abstraction favors modularity. Previous work: compelling area efficiency and performance. NoCs can simplify FPGA design. Does the NoC abstraction come at a high power cost?

12 Outline: 1. Why NoCs on FPGAs? 2. Embedded NoCs (Mixed NoCs, Hard NoCs) 3. Power Analysis

13 2. Embedded NoCs. Soft NoC = Soft Links + Soft Routers. Mixed NoC = Soft Links + Hard Routers. Hard NoC = Hard Links + Hard Routers.

14 Soft: FPGA CAD Tools. Hard: ASIC CAD Tools, Design Compiler. Mixed: HSPICE. Area, Speed, Power? Power: toggle rates, gate-level simulation.

15 2. Embedded NoCs. Labels: FPGA, Router, Logic blocks, Programmable soft interconnect. Mixed NoC = Soft Links + Hard Routers. Baseline Router: Width 32, VCs 2, Ports 5, Buffer 10/VC.

16 2. Embedded NoCs. Labels: FPGA, Router. Mixed NoC = Soft Links + Hard Routers.

17 2. Embedded NoCs. Labels: FPGA, Router. Special Feature: Configurable topology. Assumed a mesh. Can form any topology.

18 2. Embedded NoCs. Labels: FPGA, Router, Logic blocks, Dedicated hard interconnect, Programmable soft interconnect. Hard NoC = Hard Links + Hard Routers.

19 2. Embedded NoCs. Labels: FPGA, Router. Hard NoC = Hard Links + Hard Routers.

20 2. Embedded NoCs. Labels: FPGA, Router. Hard NoC = Hard Links + Hard Routers. Special Feature: Low-V mode (1.1 V, 0.9 V). Save 33% Dynamic Power. ~15% slower.
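
The 33% saving on this slide is consistent with the standard CMOS relation in which dynamic power scales with the square of the supply voltage. A minimal sketch, assuming dynamic power is proportional to C·V²·f and that the clock frequency is unchanged; the function name and the `f_ratio` parameter are illustrative and do not come from the slides:

```python
def dynamic_power_ratio(v_new: float, v_old: float, f_ratio: float = 1.0) -> float:
    """Return new/old dynamic power, assuming P is proportional to C * V^2 * f.

    f_ratio is the new clock frequency divided by the old one (1.0 = unchanged).
    """
    return (v_new / v_old) ** 2 * f_ratio


# Low-V mode from the slide: supply lowered from 1.1 V to 0.9 V.
ratio = dynamic_power_ratio(v_new=0.9, v_old=1.1)
saving = 1.0 - ratio

print(f"Dynamic power saving at equal clock: {saving:.1%}")
```

Passing `f_ratio=0.85` would also model a 15% lower clock, but the slide does not say whether its 33% figure includes any frequency change, so that is left as an option.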

21 Outline: 1. Why NoCs on FPGAs? 2. Embedded NoCs 3. Power Analysis (Components Analysis, System Analysis)

22 Soft vs. embedded NoCs (Soft = 1X). Area Gap: 20X – 23X smaller (~1.5% of FPGA vs. 33% of FPGA). Speed Gap: 5X – 6X faster (730 – 940 MHz vs. 166 MHz). Bisection BW: ~50 GB/s vs. ~10 GB/s. Power Gap: Mixed 9X, Hard 11X, Hard (Low-V) 15X. Average 64 – NoC. 1. Power-aware design 2. NoC power budget 3. Comparison. Investigate BW and power together.

23 3. Power Analysis. Total BW = 250 GB/s. Most Efficient NoC? Links Power, Routers Power. Wider Links, Fewer Routers.

24 3. Power Analysis. Total BW = 250 GB/s. Most Efficient NoC?

25 3. Power Analysis. Total BW = 250 GB/s. Most Efficient NoC?

26 3. Power Analysis. 250 GB/s total bandwidth. Typical FPGA Dynamic Power: 17.4 W. How much is used for system-level communication? Soft NoC: 123%. Mixed NoC, Hard NoC, Hard NoC (Low-V): not yet shown.

27 3. Power Analysis. 250 GB/s total bandwidth. Typical FPGA Dynamic Power: 17.4 W. NoC share: Soft NoC 123%, Mixed NoC 15%.

28 3. Power Analysis. 250 GB/s total bandwidth. Typical FPGA Dynamic Power: 17.4 W. NoC share: Soft NoC 123%, Mixed NoC 15%, Hard NoC 11%.

29 3. Power Analysis. 250 GB/s total bandwidth. Typical FPGA Dynamic Power: 17.4 W. NoC share: Soft NoC 123%, Mixed NoC 15%, Hard NoC 11%, Hard NoC (Low-V) 7%.
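
To put these percentages into absolute terms, the sketch below converts each NoC's share into watts and into energy per gigabyte moved. It assumes the percentages are fractions of the 17.4 W typical FPGA dynamic power and that each NoC carries the full 250 GB/s; the variable and function names are illustrative.

```python
FPGA_DYNAMIC_POWER_W = 17.4        # typical FPGA dynamic power from the slide
TOTAL_BANDWIDTH_GB_PER_S = 250.0   # total NoC bandwidth from the slide

# NoC power as a fraction of FPGA dynamic power (assumed interpretation).
NOC_POWER_SHARE = {
    "Soft NoC": 1.23,
    "Mixed NoC": 0.15,
    "Hard NoC": 0.11,
    "Hard NoC (Low-V)": 0.07,
}


def noc_power_watts(share: float) -> float:
    """NoC power in watts for a given share of FPGA dynamic power."""
    return share * FPGA_DYNAMIC_POWER_W


def energy_per_gb_mj(share: float) -> float:
    """Energy per gigabyte in mJ: W / (GB/s) = J/GB, then scaled to mJ."""
    return noc_power_watts(share) / TOTAL_BANDWIDTH_GB_PER_S * 1000.0


for name, share in NOC_POWER_SHARE.items():
    print(f"{name:17s} {noc_power_watts(share):6.2f} W "
          f"{energy_per_gb_mj(share):6.1f} mJ/GB")
```

Under these assumptions the Mixed NoC works out to about 10.4 mJ/GB, which matches the upper end of the 4.5 – 10.4 mJ/GB range quoted on the summary slide.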

30 3. Power Analysis. Labels: DDR3, Module 1, PCIe, Module 2, 14.6 GB/s, 17 GB/s. Full theoretical BW. Aggregate Bandwidth: 126 GB/s. NoC Power Budget: 3.5%. Cross whole chip!

31 3. Power Analysis. Point-to-point Links (1 to 1) and Broadcast (1 to n): Interconnect = Just wires. Multiple Masters (n to 1, Mux + Arbiter): Interconnect = Wires + Logic. Multiple Masters, Multiple Slaves (n to n, Mux + Arbiter): Interconnect = NoC. Compare wires interconnect to NoCs.

32 3. Power Analysis. Hard and Mixed NoCs very compelling: 1% area overhead on Stratix 5. Runs at 730-943 MHz. High Performance / Packet Switched. Power on par with simplest FPGA interconnect. Labels: Length of 1 NoC Link, 200 MHz.

33 Summary. 1. Why NoCs on FPGAs? Big city needs freeways to handle traffic. 2. Embedded NoCs: Mixed & Hard. Area: 20-23X. Speed: 5-6X. Power: 9-15X. 3. Power Analysis: Power-aware design of embedded NoCs. Power Budget for 100 GB/s: 3-7%. Point-to-point soft Links: 4.7 mJ/GB. Embedded NoCs: 4.5 – 10.4 mJ/GB.
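
The energy-per-gigabyte figures on this slide convert to power once a bandwidth is fixed, since mJ/GB multiplied by GB/s gives mW. A minimal sketch of that arithmetic at the 100 GB/s quoted above; the function and variable names are illustrative:

```python
def power_watts(energy_mj_per_gb: float, bandwidth_gb_per_s: float) -> float:
    """Power in watts: (mJ/GB) * (GB/s) = mJ/s = mW, divided by 1000 for W."""
    return energy_mj_per_gb * bandwidth_gb_per_s / 1000.0


BANDWIDTH_GB_PER_S = 100.0  # bandwidth used for the power budget on the slide

p2p_soft_links = power_watts(4.7, BANDWIDTH_GB_PER_S)
embedded_noc_low = power_watts(4.5, BANDWIDTH_GB_PER_S)
embedded_noc_high = power_watts(10.4, BANDWIDTH_GB_PER_S)

print(f"Point-to-point soft links: {p2p_soft_links:.2f} W")
print(f"Embedded NoCs: {embedded_noc_low:.2f} to {embedded_noc_high:.2f} W")
```

At this bandwidth the point-to-point soft links come to 0.47 W and the embedded NoCs to between 0.45 W and 1.04 W, which is the comparison behind the "on par with simplest FPGA interconnect" claim.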

34 eecg.utoronto.ca/~mohamed/noc_designer.html

35 eecg.utoronto.ca/~mohamed/noc_designer.html


