RFAD LAB, YONSEI University IEEE TRANSACTIONS ON COMPUTERS, VOL. 57, NO. 9, SEPTEMBER 2008 Photonic Networks-on-Chip for Future Generations of Chip Multiprocessors.

Presentation transcript:

Photonic Networks-on-Chip for Future Generations of Chip Multiprocessors
Assaf Shacham, Member, IEEE, Keren Bergman, Senior Member, IEEE, and Luca P. Carloni, Member, IEEE
IEEE Transactions on Computers, vol. 57, no. 9, September 2008
A. Shacham is with Aprius Inc., 440 N. Wolfe Rd., Sunnyvale, CA. K. Bergman is with the Department of Electrical Engineering, Columbia University, 500 W. 120th St., 1300 Mudd, New York, NY. L.P. Carloni is with the Department of Computer Science, Columbia University, 466 Computer Science Building, 1214 Amsterdam Avenue, Mail Code: 0401, New York, NY.
Kim Yeo-myung, RFAD Lab, Yonsei University

CONTENTS
• I. Introduction
• II. Related Work
• III. Hybrid NoC Microarchitecture
• IV. Network Design
• V. Design Analysis and Optimization
• VI. Comparative Power Analysis
• VII. Conclusion

INTRODUCTION
• Parallel computational cores
  – New commercial releases rely on parallel cores to drive performance
  – The interconnect and the associated global communication infrastructure are becoming central to chip performance
• The network-on-chip (NoC) problem
  – Large bandwidth and stringent latency requirements
  – An electronic NoC can provide enough performance, but at the cost of high power consumption → photonic NoC
• Photonic NoCs can deliver a dramatic reduction in the power expended on intrachip global communications while satisfying the high bandwidth requirements of CMPs
• Hybrid NoC architecture: photonic + electronic

RELATED WORK
• Relative performance of optical and electrical on-chip interconnects
  – On-chip optical interconnects can be envisioned for lengths larger than 1,000 times the wavelength, where they can have lower power and latency than electronic interconnects
• Multicore processor architecture in which remote memory accesses are implemented as transactions on a global on-chip optical bus
  – A latency reduction as high as 50 percent for some applications and a power reduction of about 30 percent over a baseline electrical bus

RELATED WORK
• An optical NoC based on a wavelength-routed crossbar
  – The crossbar is composed of passive resonator devices, and routing between an input-output pair is achieved by selecting the appropriate wavelength
  – Problem: it requires either widely tunable laser sources or large arrays of fixed-wavelength sources with fast wavelength-selection switches
• Benefits of optical intrachip interconnects
  – While optical clock distribution networks are not especially attractive, wavelength division multiplexing (WDM) does offer interesting advantages for intrachip optical interconnects over copper in deep-submicron processes

HYBRID NoC MICROARCHITECTURE
• Meaning of hybrid
  – Optical + electronic
  – Circuit-switched network (bulk messages) + packet-switched network (short messages)
• Why hybrid?
  – Why not photonic packet switching? Two functions necessary for packet switching, namely buffering and header processing, are very difficult to implement with optical devices
  – What is the problem with an electronic NoC? Electronic NoCs have many advantages in flexibility and abundant functionality, but tend to consume high power, which scales up with the transmitted bandwidth

HYBRID NoC MICROARCHITECTURE
• Operation of optical circuit switching
  1. An electronic control packet is transmitted; it is routed in the electronic network and sets up a photonic path
  2. Buffering takes place for the electronic packets during the path-setup phase
  3. The established paths are optical circuits between processing cores, enabling low power, low latency, and high bandwidth
• Advantages of the photonic path
  – Bit-rate transparency: the ability of a device to handle an optical signal regardless of its transmission speed (bit rate). In electronics, dynamic (switching) power dissipation scales with the bit rate, whereas photonic switches switch on and off once per message, so their energy dissipation does not depend on the bit rate
  – Low loss in optical waveguides

HYBRID NoC MICROARCHITECTURE
• Exploiting photonics in NoC design
  – [Figure labels: optical switch, modulator, waveguide & fiber, coupling lens]
  – The photonic NoC is constructed in a single layer, above the metal
  – Torus networks
  – Off-chip laser
  – Optical clock distribution network
  – WDM (microring-resonator structure)

HYBRID NoC MICROARCHITECTURE
• Life of a message in the photonic NoC
  1. A write operation starts from a processing unit in one core to a memory located in another core.
  2. As soon as the write address is known, a path-setup packet is sent on the electronic control network.
  3. The control packet is routed in the electronic network, reserving the photonic switches along the path for the photonic message that will follow it.
  4. When the path-setup packet reaches the destination port, the photonic path is reserved and is ready to route the message.
  5. A short light pulse can then be transmitted onto the waveguide in the opposite direction (from the destination to the source), signaling to the source that the path is open.
  6. After the message transmission is completed, a path-teardown packet is sent to free the path resources for use by other messages.
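The six steps above can be read as a small reservation protocol. The Python sketch below is only an illustration under stated assumptions: the function name `send_message`, the event names, and the switch identifiers are hypothetical, and a blocked setup is simply abandoned here instead of being buffered.

```python
def send_message(hops, busy):
    """Walk one message through setup, acknowledgement, transmission, teardown.

    hops: switch identifiers along the route, source side first (hypothetical).
    busy: set of switches currently reserved by other messages (mutated in place).
    Returns the ordered list of protocol events.
    """
    events = []
    reserved = []
    # Steps 2-3: the path-setup packet reserves each photonic switch in turn.
    for switch in hops:
        if switch in busy:
            # Assumption: on blocking, release what was reserved and give up.
            for held in reserved:
                busy.discard(held)
            events.append(("blocked", switch))
            return events
        busy.add(switch)
        reserved.append(switch)
        events.append(("reserve", switch))
    # Steps 4-5: path is complete; a light pulse travels destination -> source.
    events.append(("ack_pulse", hops[-1], hops[0]))
    # The photonic message then travels source -> destination.
    events.append(("transmit", hops[0], hops[-1]))
    # Step 6: the teardown packet frees every reserved switch.
    for switch in reserved:
        busy.discard(switch)
    events.append(("teardown", hops[0]))
    return events
```

Calling `send_message(["s0", "s1", "s2"], set())` yields three reservations followed by the acknowledgement pulse, the transmission, and the teardown, and leaves the busy set empty again.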

NETWORK DESIGN (Building Blocks)
• Photonic switching element (PSE)
  – Microring-resonator structure (similar device: optically pumped)
  – OFF state: the resonant frequency of the rings differs from the signal wavelength
  – ON state: the switch is turned on by injecting electrical current into the p-n contacts surrounding the rings
  – Switching time: 30 ps
  – Their merit lies mainly in their extremely small footprint, with ring diameters of approximately 12 µm, and their low power

NETWORK DESIGN (Building Blocks)
• 4 × 4 switches built from photonic switching elements (PSEs)
  – Each 4 × 4 switch is controlled by an electronic circuit termed an electronic router (ER)
  – Control packets are received in the ER, processed, and sent to their next hop, while the PSEs are switched ON and OFF accordingly
  – Internal blocking relationships exist in the switch. (Nonblocking switches offer improved performance and simplify network management and routing.)

NETWORK DESIGN (Topology)
• 4 × 4 folded-torus network
  – The communication requirements of a CMP are best served by a 2D regular topology such as a mesh or a torus
  – A regular 2D topology requires 5 × 5 switches, which are overly complex to implement using photonic technology
  – Therefore, a folded-torus topology is used as a base and augmented with access points for the gateways

NETWORK DESIGN (Topology)
• 4 × 4 folded-torus network
  – The access points for the gateways are designed with two goals in mind: 1) to facilitate injection and ejection without interfering with the through traffic on the torus, and 2) to avoid blocking between injected and ejected traffic, which may be caused by the switches' internal blocking

NETWORK DESIGN (Topology)
• 4 × 4 folded-torus network

NETWORK DESIGN (Flow Control)
• XY dimension-order routing on the torus network
  – Path setup takes time, on the order of nanoseconds: the setup packet travels through a number of ERs, undergoes some processing at each hop, and may be blocked
  – The transmission latency of the optical data is very short and depends only on the group velocity of light in a silicon waveguide: about 300 ps to cross 2 cm
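XY dimension-order routing on a torus can be sketched as follows. This is a minimal illustration, not the routing logic of the paper: the function name `xy_route`, the coordinate convention, and the rule of taking the shorter wraparound direction (ties broken toward increasing coordinates) are all assumptions.

```python
def xy_route(src, dst, size):
    """Return the (x, y) nodes visited from src to dst on a size x size torus.

    The X dimension is fully resolved before the Y dimension is touched,
    which is what makes the routing dimension-ordered.
    """
    def step_toward(a, b):
        # Assumption: move in whichever ring direction is shorter.
        forward = (b - a) % size
        backward = (a - b) % size
        return 1 if forward <= backward else -1

    x, y = src
    path = [(x, y)]
    while x != dst[0]:                      # X dimension first
        x = (x + step_toward(x, dst[0])) % size
        path.append((x, y))
    while y != dst[1]:                      # then Y dimension
        y = (y + step_toward(y, dst[1])) % size
        path.append((x, y))
    return path
```

For example, on a 4 × 4 torus `xy_route((0, 0), (3, 1), 4)` uses the wraparound link in X and then one step in Y, visiting `(0, 0)`, `(3, 0)`, `(3, 1)`.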

DESIGN ANALYSIS AND OPTIMIZATION
• Simulation setup
  – Developed POINTS (Photonic On-chip Interconnection Network Traffic Simulator)
  – 36-core CMP, 6 × 6 planar layout, 22 nm CMOS technology
  – The chip is assumed to be 20 mm along its edge, so each core is 3.3 × 3.3 mm in size
  – The network is a 6 × 6 folded-torus network augmented with 36 gateway access points, so it uses a matrix of 12 × 12 switches
  – Optical signals propagate in a silicon waveguide with a delay of 15.4 ps/mm
  – The resulting inter-PSE and inter-router delays are 13 ps and 220 ps, respectively
  – The PSE setup time is assumed to be 1 ns and the router processing latency is 600 ps
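The figures in the simulation setup can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted on the slide; the helper `unblocked_setup_ps` is a hypothetical, simplified model (one router latency plus one inter-router link per hop, with no queueing or blocking) and is not taken from the paper.

```python
CHIP_EDGE_MM = 20.0
CORES_PER_SIDE = 6
SWITCHES_PER_SIDE = 12
WAVEGUIDE_DELAY_PS_PER_MM = 15.4
ROUTER_LATENCY_PS = 600.0
INTERROUTER_DELAY_PS = 220.0

# Edge length of one core: 20 mm / 6, roughly 3.3 mm.
core_edge_mm = CHIP_EDGE_MM / CORES_PER_SIDE

# A 12 x 12 matrix of switches.
switch_count = SWITCHES_PER_SIDE ** 2

# Optical transit time across one full chip edge (20 mm).
edge_transit_ps = CHIP_EDGE_MM * WAVEGUIDE_DELAY_PS_PER_MM

def unblocked_setup_ps(hops):
    """Assumed model: each hop costs router processing plus one link delay."""
    return hops * (ROUTER_LATENCY_PS + INTERROUTER_DELAY_PS)
```

The optical transit across a 20 mm edge comes out at roughly 308 ps, in line with the "about 300 ps for 2 cm" figure quoted for flow control, while even a ten-hop unblocked setup under this simplified model already costs several nanoseconds.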

DESIGN ANALYSIS AND OPTIMIZATION
• Dealing with deadlock
  – Deadlock:
    1. Program 1 requests resource A and is granted it.
    2. Program 2 requests resource B and is granted it.
    3. Program 1 additionally requests resource B, but because resource B is in use by another program, it waits in the queue until B becomes available.
    4. Program 2 additionally requests resource A, but because resource A is in use by another program, it waits in the queue until A becomes available.
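The circular wait in this example can be detected mechanically by following the chain "program waits for resource, resource is held by program" until it either ends or loops back. The sketch below is a generic illustration using the program and resource names from the example; the function name `has_deadlock` and the dictionary layout are assumptions.

```python
def has_deadlock(holds, wants):
    """Detect a circular wait.

    holds: maps each resource to the program that currently owns it.
    wants: maps each waiting program to the resource it is waiting for.
    """
    for start in wants:
        seen = {start}
        current = start
        while current in wants:
            owner = holds.get(wants[current])
            if owner is None:
                break            # the wanted resource is free: chain ends
            if owner in seen:
                return True      # the chain of waiting loops back on itself
            seen.add(owner)
            current = owner
    return False

# The situation from the four steps above.
holds = {"A": "Program 1", "B": "Program 2"}
wants = {"Program 1": "B", "Program 2": "A"}
deadlocked = has_deadlock(holds, wants)
```

With the example data, `deadlocked` is `True`; removing either waiting request breaks the cycle.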

DESIGN ANALYSIS AND OPTIMIZATION
• Optimizing message size
  – Large messages → link utilization is compromised and serialization latency is increased
  – Small messages → the relative overhead of the path-setup latency becomes too large and efficiency is degraded
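The small-message side of this trade-off can be illustrated with a simple ratio of useful transmission time to total time. This is a rough sketch under assumptions: the function name `transfer_efficiency` is hypothetical, the setup latency is a free parameter (no value is given on the slide), the default 960 Gbps is the gateway bandwidth quoted in the power-analysis slides, and the penalty that large messages impose on link utilization is not modeled at all.

```python
def transfer_efficiency(message_bytes, setup_ns, bandwidth_gbps=960.0):
    """Fraction of the total time spent actually transmitting data.

    bits / (Gbit/s) gives nanoseconds, so the units line up with setup_ns.
    """
    transmit_ns = message_bytes * 8 / bandwidth_gbps
    return transmit_ns / (transmit_ns + setup_ns)

# Hypothetical setup latency of 10 ns, chosen only for illustration.
small = transfer_efficiency(64, setup_ns=10.0)
large = transfer_efficiency(16 * 1024, setup_ns=10.0)
```

Under these assumptions the 64-byte message spends most of its time waiting for the path, while the 16 Kbyte message amortizes the same setup cost over a much longer transfer.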

DESIGN ANALYSIS AND OPTIMIZATION
• Optimizing message size
  – The optimal DMA block size for transactions over the photonic NoC ranges between 4 and 16 Kbytes

DESIGN ANALYSIS AND OPTIMIZATION
• Increasing path multiplicity

DESIGN ANALYSIS AND OPTIMIZATION
• Evaluating path-setup procedures
  – Reductions in path-setup latency translate to improved efficiency of the network interfaces and to higher average bandwidth
  – t_q is a major contributor to the overall setup latency
  – One technique mentioned for reducing t_q is to immediately drop any path-setup packet that is blocked instead of buffering it

COMPARATIVE POWER ANALYSIS
• Power analysis: power reduction is the main motivation for the design of a photonic NoC
  – To evaluate the power savings, a comparative high-level power analysis is performed
• Conditions of the power analysis
  – Same bandwidth and same number of processing cores
  – Assumed: 22 nm CMOS technology hosting 36 processing cores, each requiring a peak bandwidth of 800 Gbps and an average bandwidth of 512 Gbps
  – Assumed: uniform traffic model, mesh topology, XY dimension-order routing

COMPARATIVE POWER ANALYSIS
• Reference electronic NoC: power is consumed at each hop by
  1. Reading from a buffer (high bandwidth requires wide parallel lines),
  2. Traversing the router's internal crossbar,
  3. Transmission across the inter-router link,
  4. Writing to a buffer in the subsequent router, and
  5. Triggering an arbitration decision.

COMPARATIVE POWER ANALYSIS
• Proposed photonic NoC
  1. The photonic data-transfer network (6 × 6 CMP)
    – Path multiplicity factor: 2 → 12 × 12 photonic mesh (576 PSEs)
    – Power of a PSE: ON state → 10 mW, OFF state → no dissipation
    – Total power consumption: estimated statistically
  2. The electronic control network (6 × 6 CMP)
    – Each photonic message is accompanied by two 32-bit control packets, and the typical size of a message is 2 Kbytes
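The counts on this slide can be reproduced with simple arithmetic. The sketch below is only a sanity check on the quoted numbers: the value of four PSEs per switch is derived from 576 PSEs over 144 switches, 1 Kbyte is assumed to be 1024 bytes, and `photonic_power_mw` is a hypothetical helper that takes the number of PSEs in the ON state as an input because the slide does not give that average.

```python
CORES_PER_SIDE = 6
PATH_MULTIPLICITY = 2
TOTAL_PSES = 576
PSE_ON_POWER_MW = 10.0
CONTROL_PACKET_BITS = 32
CONTROL_PACKETS_PER_MESSAGE = 2
MESSAGE_BYTES = 2 * 1024          # assumption: 1 Kbyte = 1024 bytes

# 6 cores per side with path multiplicity 2 gives a 12 x 12 switch matrix.
switches_per_side = CORES_PER_SIDE * PATH_MULTIPLICITY
switch_count = switches_per_side ** 2
pses_per_switch = TOTAL_PSES // switch_count

def photonic_power_mw(pses_on):
    """Power of the data-transfer network for a given number of ON PSEs."""
    if not 0 <= pses_on <= TOTAL_PSES:
        raise ValueError("pses_on must be between 0 and TOTAL_PSES")
    return pses_on * PSE_ON_POWER_MW   # OFF PSEs dissipate nothing

# Control traffic relative to the payload of one typical message.
control_overhead = (CONTROL_PACKETS_PER_MESSAGE * CONTROL_PACKET_BITS) / (MESSAGE_BYTES * 8)
```

The control overhead works out to well under one percent of the payload bits, which is consistent with the control network carrying far less traffic than the photonic data network.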

COMPARATIVE POWER ANALYSIS
• Proposed photonic NoC
  3. The E/O and O/E interfaces (modulators and receivers)
    – 960 Gbps bandwidth → 40 Gbps × 24 wavelengths → 24 modulators and receivers are required
    – With silicon ring-resonator modulators and SiGe photodetectors, the energy is estimated to decrease to about 0.2 pJ/bit in the coming years
    – Supplementary circuits that are usually required for the implementation of optical receivers (CDR, serializer, etc.) are not needed in an ultrashort link in which the modulation rate is equal to the chip clock rate
    – The off-chip laser sources consume an estimated power of 10 mW per wavelength. Although a large number of lasers are required to exploit the bandwidth potential of the optical NoC, their power is dissipated off-chip and does not contribute to the chip power density
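The bandwidth and energy figures above combine as follows. This is a back-of-the-envelope sketch: the function names are hypothetical, and treating 0.2 pJ/bit as the combined modulator and receiver energy per transmitted bit is an assumption, since the slide does not say how the figure is split.

```python
def gateway_bandwidth_gbps(wavelengths=24, rate_gbps=40.0):
    """Aggregate bandwidth of one gateway using WDM."""
    return wavelengths * rate_gbps

def interface_power_mw(bandwidth_gbps, energy_pj_per_bit=0.2):
    """Power of the optical interface at a given sustained bit rate.

    Gbit/s times pJ/bit equals 1e9 * 1e-12 W, i.e. milliwatts.
    """
    return bandwidth_gbps * energy_pj_per_bit

def laser_power_mw(wavelengths=24, per_wavelength_mw=10.0):
    """Off-chip laser power for one set of wavelengths."""
    return wavelengths * per_wavelength_mw

peak_bandwidth = gateway_bandwidth_gbps()
peak_interface_power = interface_power_mw(peak_bandwidth)
```

Under these assumptions, 24 wavelengths at 40 Gbps give the quoted 960 Gbps, and sustaining that rate at 0.2 pJ/bit corresponds to roughly 192 mW per gateway, with the 240 mW of laser power for the 24 wavelengths dissipated off-chip.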

CONCLUSION
• The motivation behind the work
  1. Multicore processors are entering an era in which high-bandwidth communication between large numbers of cores is a key driver of computing performance.
  2. Power dissipation has clearly become the limiting factor in the design of high-performance microprocessors.
  3. Recent breakthroughs in the field of silicon photonics suggest that the integration of optical elements with CMOS electronics is likely to become viable in the near future.
• The paper aims to lay the groundwork for future research by providing a complete discussion of the fundamental issues that need to be addressed to design a photonic NoC for CMPs