Silicon Nanophotonic Network-On-Chip Using TDM Arbitration


1 Silicon Nanophotonic Network-On-Chip Using TDM Arbitration
Gilbert Hendry – Columbia University Johnnie Chan, Shoaib Kamil, Lenny Oliker, John Shalf, Luca P. Carloni, Keren Bergman

2 Why Photonics?
Photonics changes the rules for Bandwidth, Energy, and Distance.
OPTICS:
Modulate/receive high bandwidth data stream once per communication event.
Broadband switch routes entire multi-wavelength stream.
Off-chip BW = On-chip BW for nearly same power.
ELECTRONICS:
Buffer, receive and re-transmit at every router.
Each bus lane routed independently (P ∝ N_LANES).
Off-chip BW is pin-limited and power hungry.
[Figure labels: TX, RX]

3 Silicon Photonic Integration
Cornell, 2009 Cornell, 2005 Sandia, 2008 Ghent, 2007 Columbia, 2008

4 Photonic Networks-on-Chip
Corona [U. of Wisconsin, HP]; Photonic Clos [MIT]; Photonic Torus [Columbia]

5 Ring Resonators
[Figure labels: Modulator/filter; Broadband; λ]

6 Circuit-switched P-NoCs
[Figure labels: 0V, 1V; n-region, p-region; Electronic Control; Ohmic Heater; Thermal Control; Transmission; Injected Wavelengths; Off-resonance profile; On-resonance profile; S, D]

7 Circuit-switched P-NoCs
Pros:
Energy-efficient end-to-end transmission
High bandwidth through WDM
Electronic network still available for small control messages*
Network-level support for secure regions
Cons:
Path setup latency
Path setup contention (no fairness)
Longer paths block more
Head-of-line blocking at gateways
* [G. Hendry et al. Analysis of Photonic Networks for a Chip Multiprocessor Using Scientific Applications. In NOCS, 2009]

8 Head of Line Blocking
[Figure labels: Cores; Network IF; Electronic Crossbar; Control Router (To/From Control plane); Serialization, Drivers; Deserialization, Receivers; Tx/Rx; 5-port photonic switch (To/From Data plane); Bidirectional Electronic Channel; Bidirectional Waveguide; External Concentration*]
Presenter note: make it clear that the orange one is the optical switch.
* [P. Kumar et al. Exploring concentration and channel slicing in on-chip network router. In NOCS, 2009]

9 TDM Arbitration
[Figure labels: Time slot 0, Time slot 1, ..., Time slot T; t0, t1, t2, t3, t4, ..., tC-3, tC-2, tC-1]

10 Synchronous Gateway/Control
Time slot ~ 10ns
TDM sync clock ~ 100MHz
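The two figures on this slide are consistent with each other: one period of a 100 MHz clock is 10 ns. A minimal sketch of how a gateway could derive the active slot from the shared clock follows. It assumes one time slot per clock period and a schedule that wraps around after `num_slots` slots; the function names are hypothetical and not taken from the slides.

```python
TDM_CLOCK_HZ = 100e6  # ~100 MHz sync clock, as stated on the slide


def slot_period_ns(clock_hz: float) -> float:
    """Length of one time slot, assuming one slot per clock period."""
    return 1e9 / clock_hz


def active_slot(time_ns: float, num_slots: int,
                clock_hz: float = TDM_CLOCK_HZ) -> int:
    """Index of the slot that is active at time_ns (schedule repeats)."""
    elapsed_slots = int(time_ns // slot_period_ns(clock_hz))
    return elapsed_slots % num_slots
```

Because every gateway and switch derives the slot index from the same clock, no per-message arbitration is needed to agree on which slot is current.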

11 Nonblocking Network Scheduling
[Figure panels: Time slot 0, Time slot 1, Time slot 2]
Required time slots = N-1
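One way to see why N-1 slots suffice on a nonblocking network is a cyclic-shift schedule: in slot t, every node i sends to node (i + t + 1) mod N. The sketch below is an illustration under that assumption; the slides do not say which permutation sequence the authors use, and `cyclic_schedule` is a hypothetical name.

```python
def cyclic_schedule(n):
    """Return n-1 slots; each slot is a list of (source, destination) pairs.

    In the slot with shift s, node i sends to (i + s) % n, so every slot is
    a permutation with no node sending to itself.
    """
    return [[(src, (src + shift) % n) for src in range(n)]
            for shift in range(1, n)]


if __name__ == "__main__":
    for t, slot in enumerate(cyclic_schedule(4)):
        print(f"Time slot {t}: {slot}")
```

Each slot has every node sending once and receiving once, so there is no source or destination contention, and the N-1 shifts together cover all N(N-1) source-destination pairs.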

12 However…
Nonblocking topology difficult to implement because of Insertion Loss
[M. Petracca et al. IEEE Micro, 2008]
* [J. Chan et al. Architectural Exploration of Chip-Scale Photonic Interconnection Network Designs Using Physical-Layer Analysis. JLT, May 2010]

13 Scheduling Time Slots
Blocking Network
Problem:
Full coverage
Minimize Time Slots (most comm. per slot)
Constraints:
Source contention
Destination contention
Topology contention
Presenter note: say we're doing full coverage, but specialized comm. patterns would be better. Possibly an additional slide to set up why/when we do this.
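The three constraints can be expressed as a pairwise conflict test between two communications that want to share a time slot. The sketch below is an illustration, not the authors' implementation. It assumes a square mesh with row-major node numbering and the X-then-Y routing mentioned on the photonic switch slide, and it models topology contention as two paths sharing a directed link. All function names are hypothetical.

```python
def xy_links(src, dst, width):
    """Directed mesh links used when routing src -> dst, X first, then Y."""
    x, y = src % width, src // width
    dst_x, dst_y = dst % width, dst // width
    links = set()
    while x != dst_x:  # travel along the row first
        next_x = x + (1 if dst_x > x else -1)
        links.add(((x, y), (next_x, y)))
        x = next_x
    while y != dst_y:  # then along the column
        next_y = y + (1 if dst_y > y else -1)
        links.add(((x, y), (x, next_y)))
        y = next_y
    return links


def conflicts(a, b, width):
    """True if communications a and b cannot share a time slot."""
    (src_a, dst_a), (src_b, dst_b) = a, b
    if src_a == src_b:  # source contention
        return True
    if dst_a == dst_b:  # destination contention
        return True
    # topology contention: the two paths share a directed link
    return bool(xy_links(src_a, dst_a, width) & xy_links(src_b, dst_b, width))
```

On a 4x4 mesh, for example, `conflicts((0, 2), (1, 3), 4)` is true because both paths use the link from column 1 to column 2 of the top row, even though the sources and destinations differ.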

14 Solution: Genetic Search
Flow: Initialization, Population (size P), Selection (down to size ps x P), Reproduction (back to P), Mutation (still P)
[Figure: population of schedules, each labeled S]
Genetic search is established; here it is used to solve this problem. This is the overall flow.
A communication is a source-destination pair.
Initial schedule: Slot 0: c0; Slot 1: c1; ...; Slot N^2: cN^2
Example schedule: Slot 0: c0, c5, c7, c8; Slot 1: c23, c6, c58; ...; Slot T: c42, c65, c1
Fitness = 1/(number of time slots)
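The flow above can be sketched as a small program. This is a minimal illustration under stated assumptions, not the authors' algorithm: it checks only source and destination contention (no topology contention), builds initial schedules by first-fit placement in random order rather than one communication per slot, and uses guessed crossover and mutation operators. The defaults `pop_size=50` and `mutation_prob=0.8` come from the Schedule Results slide; `survive`, `generations`, `seed`, and all function names are assumptions.

```python
import random


def conflicts(a, b):
    """Source or destination contention only (simplifying assumption)."""
    return a[0] == b[0] or a[1] == b[1]


def place(schedule, comm):
    """First-fit: put comm in the first compatible slot, else open a new one."""
    for slot in schedule:
        if not any(conflicts(comm, other) for other in slot):
            slot.append(comm)
            return
    schedule.append([comm])


def random_schedule(comms, rng):
    order = comms[:]
    rng.shuffle(order)
    schedule = []
    for comm in order:
        place(schedule, comm)
    return schedule


def fitness(schedule):
    return 1.0 / len(schedule)  # from the slide: 1/(number of time slots)


def crossover(a, b, rng):
    """Keep a random half of parent a's slots, fill the rest in b's order."""
    child = [slot[:] for slot in rng.sample(a, len(a) // 2)]
    used = {comm for slot in child for comm in slot}
    for slot in b:
        for comm in slot:
            if comm not in used:
                place(child, comm)
    return child


def mutate(schedule, rng):
    """Dissolve one random slot and re-place its communications."""
    if len(schedule) < 2:
        return schedule
    victim = rng.randrange(len(schedule))
    rest = [slot[:] for i, slot in enumerate(schedule) if i != victim]
    for comm in schedule[victim]:
        place(rest, comm)
    return rest


def genetic_search(comms, pop_size=50, survive=0.5, mutation_prob=0.8,
                   generations=30, seed=0):
    rng = random.Random(seed)
    population = [random_schedule(comms, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[:max(2, int(survive * pop_size))]  # selection
        children = []
        while len(survivors) + len(children) < pop_size:  # reproduction
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(crossover(parent_a, parent_b, rng))
        # Mutation is applied to children only, so the best survivor is kept.
        children = [mutate(c, rng) if rng.random() < mutation_prob else c
                    for c in children]
        population = survivors + children
    return max(population, key=fitness)
```

For a full-coverage problem, `comms` would be every source-destination pair, for example `[(s, d) for s in range(16) for d in range(16) if s != d]`. Every operator preserves validity: each communication appears exactly once and no slot contains a conflicting pair.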

15 Reproduction: Birds and Bees
[Figure: example slot contents for reproduction. Slots shown: c0, c3, c60, c19 | c12, c2, c1, c60 | c27, c4 | c100, c82, c9 | c100, c71, c9 | c0 | c1, c17, c23 | c89, c56, c16, c63. Result labeled C: c0, c3, c60, c19 | c12, c2, c1, c60]

16 Mutation: Secret of the Ooze
[Figure: example slot contents for mutation. Before: c0, c3, c60, c19 | c100 | c27, c4 | c71 | c100, c71, c9 | c9 | c1, c17, c23. Result labeled S: c0, c3, c60, c19, c9 | c100 | c27, c4, c100 | c71 | c9 | c1, c17, c23, c71]

17 Schedule Results
Pop size = 50
Mutation prob = 0.8
[Figure panels: 16-node, 36-node]

18 Implementation: Photonic Switch
200µm rings
Total switch size = 1.4mm x 1.4mm
No S->W, S->E, N->W, N->E (X-then-Y routing)
Presenter note: highlight dimensions, make bigger, or put in bullets. Show paths for implemented/unimplemented paths.

19 Implementation: Switch Control
Width of LUT = 12 (number of rings)
Length of LUT = T (number of time slots)
Presenter note: say something about overhead (area, power): small.
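The lookup table described here can be pictured as T rows of 12 bits, one bit per ring, indexed by the current time slot. The sketch below is a software illustration of that idea only. Which ring index corresponds to which turn in the switch is not given in the slides, so the ring numbering and the function names are hypothetical.

```python
NUM_RINGS = 12  # LUT width from the slide: one bit per ring


def build_lut(ring_sets, num_rings=NUM_RINGS):
    """ring_sets[t] lists the rings switched on in slot t.

    Returns a list of T integers, each a num_rings-bit mask.
    """
    lut = []
    for rings in ring_sets:
        word = 0
        for ring in rings:
            if not 0 <= ring < num_rings:
                raise ValueError(f"ring index {ring} out of range")
            word |= 1 << ring
        lut.append(word)
    return lut


def ring_states(lut, slot_counter, num_rings=NUM_RINGS):
    """On/off state of each ring for the given slot counter (wraps at T)."""
    word = lut[slot_counter % len(lut)]
    return [bool((word >> ring) & 1) for ring in range(num_rings)]
```

With T slots the table holds 12 x T bits per switch, which is consistent with the presenter's remark that the control overhead is small.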

20 Implementation: Network Gateway
1. Send request
2. Grant, set x-bar and transmit to serializer
3. Receive, deserialize
4. Store in temp buffer, request to core

21 Simulation Setup
PhoenixSim* – Photonic and Electronic network simulator
64 cores
Networks: E-mesh, P-mesh, P-TDM
Traffic:
Random – 32B, 1kB, 32kB messages
Scientific application traces
Presenter note: put message sizes here. Might want pictures of each network.
* [Chan et al. PhoenixSim: A Simulator for Physical-Layer Analysis of Chip-Scale Photonic Interconnection Networks. In DATE 2010]

22 Results – Random Traffic
32B

23 Results – Random Traffic
32B 1kB

24 Results – Random Traffic
32B 1kB 32kB

25 Results – Scientific Applications
Benchmark   Num Phases   Num Messages   Total Size (MB)   Avg Msg Size (B)
Cactus      2            285            7.3               25600
GTC                      63             8.1               129796
MADbench    195          15414          86.5              5613
PARATEC     34           126059         5.4               43.3
Presenter note: say first that higher is better. Maybe an efficiency graph (1/et).

26 Conclusion
TDM implements fairness
TDM improves network utilization
Genetic Search useful for finding full-coverage static schedule
Future Work:
Scaling gracefully*
Reducing time slots*
Dynamic scheduling
Contact:
* [Hendry et al. Time-Division-Multiplexed Arbitration in Silicon Nanophotonic Networks-on-Chip for High Perf. CMPs. In JPDC, Jan 2011]

