Silicon Nanophotonic Network-On-Chip Using TDM Arbitration

Presentation transcript:

Silicon Nanophotonic Network-On-Chip Using TDM Arbitration Gilbert Hendry – Columbia University Johnnie Chan, Shoaib Kamil, Lenny Oliker, John Shalf, Luca P. Carloni, Keren Bergman

Why Photonics?
Photonics changes the rules for Bandwidth, Energy, and Distance.
OPTICS:
- Modulate/receive high bandwidth data stream once per communication event.
- Broadband switch routes entire multi-wavelength stream.
- Off-chip BW = On-chip BW for nearly same power.
ELECTRONICS:
- Buffer, receive and re-transmit at every router.
- Each bus lane routed independently. (P ∝ N_LANES)
- Off-chip BW is pin-limited and power hungry.

Silicon Photonic Integration Cornell, 2009 Cornell, 2005 Sandia, 2008 Ghent, 2007 Columbia, 2008

Photonic Networks-on-Chip
- Corona [U. of Wisconsin, HP]
- Photonic Clos [MIT]
- Photonic Torus [Columbia]

Ring Resonators Modulator/filter Broadband λ λ

Circuit-switched P-NoCs
[Figure labels: electronic control (0V, 1V, n-region, p-region); thermal control (ohmic heater); transmission vs. injected wavelengths, with off-resonance and on-resonance profiles; S, D]

Circuit-switched P-NoCs
Pros:
- Energy-efficient end-to-end transmission
- High bandwidth through WDM
- Electronic network still available for small control messages*
- Network-level support for secure regions
Cons:
- Path setup latency
- Path setup contention (no fairness)
- Longer paths block more
- Head-of-line blocking at gateways
* [G. Hendry et al. Analysis of Photonic Networks for a Chip Multiprocessor Using Scientific Applications. In NOCS, 2009]

Head of Line Blocking
External Concentration*
[Figure labels: cores; network IF; electronic crossbar; control router (to/from control plane); serialization drivers and deserialization receivers (Tx/Rx); 5-port photonic switch (to/from data plane); bidirectional electronic channel; bidirectional waveguide]
Note: Make it clear that the orange one is the optical switch.
* [P. Kumar et al. Exploring concentration and channel slicing in on-chip network router. In NOCS, 2009]

TDM Arbitration
[Figure labels: t0, t1, t2, t3, t4, …, tC-3, tC-2, tC-1; Time slot 0, Time slot 1, …, Time slot T]

Synchronous Gateway/Control
- Time slot ~ 10 ns
- TDM sync clock ~ 100 MHz
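As a rough illustration of how the two figures on this slide relate, the sketch below assumes one time slot per cycle of the TDM sync clock. That is an assumption; the slide only gives approximate values. The function names are hypothetical.

```python
def slot_period_ns(sync_clock_hz: float) -> float:
    """Length of one time slot in ns, assuming one slot per sync-clock cycle."""
    return 1e9 / sync_clock_hz


def active_slot(time_ns: float, period_ns: float, num_slots: int) -> int:
    """Index of the slot active at time_ns for a schedule that repeats
    every num_slots slots."""
    return int(time_ns // period_ns) % num_slots


period = slot_period_ns(100e6)       # 100 MHz clock gives a 10 ns slot
slot = active_slot(45.0, period, 4)  # fifth slot of a 4-slot schedule wraps to slot 0
```

Because every gateway and switch derives the slot index from the same clock, they can agree on the active slot without exchanging path-setup messages.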

Nonblocking Network Scheduling Time slot 0 Time slot 1 Time slot 2 Required time slots = N-1
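One standard way to reach N-1 slots on a network with no internal blocking is a cyclic-shift schedule: in slot k, node i sends to node (i + k + 1) mod N. The sketch below illustrates that construction; it is an assumption that the slide's figure follows a similar pattern, and the function name is hypothetical.

```python
def nonblocking_schedule(n):
    """Return n-1 slots; each slot is a list of (source, destination) pairs.

    In slot k every node sends to the node k+1 positions ahead, so each slot
    has every source and every destination exactly once.
    """
    return [[(src, (src + k + 1) % n) for src in range(n)]
            for k in range(n - 1)]


schedule = nonblocking_schedule(4)
for k, slot in enumerate(schedule):
    print(f"Time slot {k}: {slot}")
```

This only works when the topology itself never blocks, since the schedule resolves source and destination contention but assumes any permutation can be routed at once.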

However…
Nonblocking topology difficult to implement because of insertion loss [M. Petracca et al. IEEE Micro, 2008]
* [J. Chan et al. Architectural Exploration of Chip-Scale Photonic Interconnection Network Designs Using Physical-Layer Analysis. JLT, May 2010]

Scheduling Time Slots
Problem:
- Blocking network
- Full coverage
- Minimize time slots (most comm. per slot)
Constraints:
- Source contention
- Destination contention
- Topology contention
Note: Say we’re doing full coverage, but specialized comm. patterns would be better. Possibly additional slide to set up why/WHEN we do this.
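The three constraints can be sketched as a validity check on a single time slot. This is a minimal illustration, assuming a mesh with nodes numbered row by row, X-then-Y routing, and topology contention modeled as two paths sharing a directed link between neighboring nodes. Contention inside a real photonic switch is finer-grained than this, and all names here are hypothetical.

```python
def xy_path_links(src, dst, width):
    """Directed links used when routing X first, then Y, on a mesh
    that is `width` nodes wide (nodes numbered row by row)."""
    x, y = src % width, src // width
    dx, dy = dst % width, dst // width
    links = []
    while x != dx:
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def slot_is_valid(slot, width):
    """True if no two communications in the slot share a source,
    a destination, or a directed link."""
    sources, dests, used_links = set(), set(), set()
    for src, dst in slot:
        links = xy_path_links(src, dst, width)
        if src in sources or dst in dests:
            return False  # source or destination contention
        if any(link in used_links for link in links):
            return False  # topology contention (assumed link-level model)
        sources.add(src)
        dests.add(dst)
        used_links.update(links)
    return True


print(slot_is_valid([(0, 3), (4, 7)], width=4))  # different rows
print(slot_is_valid([(0, 3), (1, 2)], width=4))  # both need link (1,0)->(2,0)
```

A scheduler that packs more communications into a slot has to pass a check of this kind for every addition, which is what makes minimizing the slot count a search problem.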

Solution: Genetic Search
1. Initialization: population (size P)
2. Selection (down to size ps × P)
3. Reproduction (back to P)
4. Mutation (still P)
Initial schedule (one communication per slot): Slot 0: c0; Slot 1: c1; …; Slot N^2: cN^2
Example schedule with several communications per slot: Slot 0: c0, c5, c7, c8; Slot 1: c23, c6, c58; …; Slot T: c42, c65, c1
Fitness = 1/(number of time slots)
Note: Genetic search established, using it to solve this problem. This is the overall flow. Communication is source-destination pair.
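The overall flow can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: reproduction is modeled as cloning survivors rather than combining two parents, the conflict rule covers only source and destination contention, and `survive_frac` is a hypothetical stand-in for ps. The defaults for population size (50) and mutation probability (0.8) are the values listed on the Schedule Results slide.

```python
import random


def conflicts(a, b):
    """Assumed rule: two communications clash if they share a source
    or a destination (topology contention is left out here)."""
    return a[0] == b[0] or a[1] == b[1]


def fitness(schedule):
    return 1.0 / len(schedule)  # fewer time slots is better


def mutate(schedule, rng):
    """Dissolve one random slot by moving its communications into
    other slots where they fit; leftovers stay together in one slot."""
    schedule = [list(slot) for slot in schedule]
    victim = schedule.pop(rng.randrange(len(schedule)))
    leftovers = []
    for comm in victim:
        for slot in schedule:
            if not any(conflicts(comm, other) for other in slot):
                slot.append(comm)
                break
        else:
            leftovers.append(comm)
    if leftovers:
        schedule.append(leftovers)
    return schedule


def genetic_search(n, pop_size=50, survive_frac=0.5, mutation_prob=0.8,
                   generations=200, seed=0):
    rng = random.Random(seed)
    comms = [(s, d) for s in range(n) for d in range(n) if s != d]
    # Initialization: every individual starts with one communication per slot.
    population = [[[c] for c in comms] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fittest fraction.
        population.sort(key=fitness, reverse=True)
        survivors = population[:max(1, int(survive_frac * pop_size))]
        # Reproduction (simplified to cloning): refill back to pop_size.
        population = [rng.choice(survivors) for _ in range(pop_size)]
        # Mutation: population size is unchanged.
        population = [mutate(ind, rng) if rng.random() < mutation_prob else ind
                      for ind in population]
    return max(population, key=fitness)


best = genetic_search(16)
print(len(best), "time slots for 16 nodes")
```

In this sketch a mutation never increases the number of slots, so fitness can only stay level or improve from one generation to the next.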

Reproduction: Birds and Bees c0, c3, c60, c19 c12, c2, c1, c60 c27, c4 c100, c82, c9 c100, c71, c9 c0 … … c1, c17, c23 c89, c56, c16, c63 C c0, c3, c60, c19 c12, c2, c1, c60

Mutation: Secret of the Ooze
Before (schedule S): [c0, c3, c60, c19] [c27, c4] [c100, c71, c9] … [c1, c17, c23]
Moved: c9, c100, c71
After: [c0, c3, c60, c19, c9] [c27, c4, c100] … [c1, c17, c23, c71]

Schedule Results
Pop size = 50, Mutation prob = 0.8
[Plots: 16-node, 36-node]

Implementation: Photonic Switch
- 200µm rings
- Total switch size = 1.4mm x 1.4mm
- No S->W, S->E, N->W, N->E (X-then-Y routing)
Note: Highlight dimensions, make bigger, or put in bullets. Show paths for implemented/unimplemented paths.

Implementation: Switch Control
- Width of LUT = 12 (number of rings)
- Length of LUT = T (number of time slots)
Note: Say something about overhead (area, power) - small
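A lookup table of this shape can be sketched as T words of 12 bits each. The sketch assumes one on/off bit per ring and a slot counter that wraps every T slots; the ring numbering and function names are hypothetical.

```python
NUM_RINGS = 12  # LUT width from the slide: number of rings in the switch


def build_lut(ring_sets_per_slot):
    """Pack, for each time slot, the set of rings to switch on into one
    12-bit word. The LUT length equals the number of time slots T."""
    lut = []
    for rings in ring_sets_per_slot:
        word = 0
        for r in rings:
            if not 0 <= r < NUM_RINGS:
                raise ValueError(f"ring index {r} out of range")
            word |= 1 << r
        lut.append(word)
    return lut


def rings_on(lut, slot_counter):
    """Rings to switch on for the current slot; the counter wraps at T."""
    word = lut[slot_counter % len(lut)]
    return [r for r in range(NUM_RINGS) if (word >> r) & 1]


lut = build_lut([{0, 3}, set(), {11}])  # T = 3 slots
print(rings_on(lut, 3))                 # slot counter 3 wraps to slot 0
print(len(lut) * NUM_RINGS, "bits of storage")
```

Under these assumptions the storage per switch is 12 × T bits, which is consistent with the slide's remark that the control overhead is small.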

Implementation: Network Gateway
1. Send request
2. Grant, set x-bar and transmit to serializer
3. Receive, deserialize
4. Store in temp buffer, request to core

Simulation Setup
PhoenixSim* – Photonic and Electronic network simulator
- 64 cores
- E-mesh, P-mesh, P-TDM
Traffic:
- Random – 32B, 1kB, 32kB messages
- Scientific application traces
Note: Put message sizes here. Might want pictures of each network
* [Chan et al. PhoenixSim: A Simulator for Physical-Layer Analysis of Chip-Scale Photonic Interconnection Networks. In DATE 2010]

Results – Random Traffic 32B

Results – Random Traffic 32B 1kB

Results – Random Traffic 32B 1kB 32kB

Results – Scientific Applications
Benchmark   Num Phases   Num Messages   Total Size (MB)   Avg Msg Size (B)
Cactus      2            285            7.3               25600
GTC                      63             8.1               129796
MADbench    195          15414          86.5              5613
PARATEC     34           126059         5.4               43.3
Note: Say first: higher is better. Maybe efficiency graph (1/et)

Conclusion
- TDM implements fairness
- TDM improves network utilization
- Genetic Search useful for finding full-coverage static schedule
Future Work:
- Scaling gracefully*
- Reducing time slots*
- Dynamic scheduling
Contact: gilbert@ee.columbia.edu
* [Hendry et al. Time-Division-Multiplexed Arbitration in Silicon Nanophotonic Networks-on-Chip for High Perf. CMPs. In JPDC, Jan 2011]