
1 Photonic On-Chip Networks for Performance-Energy Optimized Off-Chip Memory Access
Gilbert Hendry, Johnnie Chan, Daniel Brunina, Luca Carloni, Keren Bergman
Lightwave Research Laboratory, Columbia University, New York, NY

2 Motivation
The memory gap warrants a paradigm shift in how we move information to and from storage and computing elements [Exascale Report, 2008]
Lightwave Research Lab, Columbia University 9/17/2018

3 Main Premise
Current memory subsystem technology and packaging are not well suited to future trends:
Networks on chip
Growing cache sizes
Growing bandwidth requirements
Growing pin counts

4 SDRAM context
DIMMs are controlled fully in parallel, sharing access to the data and address buses
Many wires/pins
Matched signal paths (for delay)
DIMMs are made for short, random accesses
[Figure: a chip connected through a memory controller to several DIMMs. Note on the memory controller: lately, this is on chip [Intel].]

5 Future SDRAM context
Example: Tilera TILE 64

6 SDRAM DIMM Anatomy
[Figure: SDRAM hierarchy in three panels.
DRAM_DIMM: ranks of SDRAM devices sharing the addr/cntrl lines.
DRAM_Chip: banks (usually 8), with IO and Cntrl blocks and a data path.
DRAM_Bank: DRAM cell arrays with a row decoder (row addr/en), column decoder (col addr/en), sense amps, and data lines.]
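
The rank/bank/row/column hierarchy named in the figure can be illustrated by splitting a flat address into those fields. This is a minimal sketch, not the mapping used in the presentation: the field order and the row and column widths are assumptions, and only the 8 banks and 2 ranks come from the slides. The names `DramAddress` and `decode` are hypothetical.

```python
from typing import NamedTuple

# 8 banks and 2 ranks follow the slides; the row and column
# widths below are illustrative assumptions.
COL_BITS = 10
BANK_BITS = 3   # 8 banks
RANK_BITS = 1   # 2 ranks
ROW_BITS = 14


class DramAddress(NamedTuple):
    rank: int
    bank: int
    row: int
    col: int


def decode(addr: int) -> DramAddress:
    """Split a flat address into rank/bank/row/column fields.

    Assumed layout, from least to most significant bits:
    column | bank | rank | row.
    """
    col = addr & ((1 << COL_BITS) - 1)
    addr >>= COL_BITS
    bank = addr & ((1 << BANK_BITS) - 1)
    addr >>= BANK_BITS
    rank = addr & ((1 << RANK_BITS) - 1)
    addr >>= RANK_BITS
    row = addr & ((1 << ROW_BITS) - 1)
    return DramAddress(rank=rank, bank=bank, row=row, col=col)
```

Placing the bank bits just above the column bits spreads consecutive blocks across banks; other orderings are equally plausible.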

7 Memory Access in an Electronic NoC
Messages are packetized; the size of a packet is determined by the router buffers
End result: burst length in DRAM is dictated by packet size (limited burst length). This results in short, random accesses.
[Figure: a message split into packets crossing NoC routers and the chip boundary to the memory controller. Caption: burst length dictated by packet size.]
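
The link between packet size and burst length can be made concrete with a little arithmetic. This is a sketch under stated assumptions: the 256 b packet size is taken from the electronic NoC setup in this deck, the 64-bit data bus is the usual DIMM width and is assumed here, and packet headers are ignored. The function names are hypothetical.

```python
import math

PACKET_BITS = 256     # packet size from the electronic NoC setup
DATA_BUS_BITS = 64    # assumption: standard 64-bit DIMM data bus


def packets_needed(message_bits: int, packet_bits: int = PACKET_BITS) -> int:
    """Number of NoC packets a message is split into (headers ignored)."""
    return math.ceil(message_bits / packet_bits)


def beats_per_packet(packet_bits: int = PACKET_BITS,
                     bus_bits: int = DATA_BUS_BITS) -> int:
    """Data-bus transfers one packet can fill, i.e. the burst it allows."""
    return packet_bits // bus_bits
```

Under these assumptions a 4096-bit message becomes 16 separate packets, each allowing a burst of only 4 data-bus transfers.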

8 Memory Control
Complex DRAM control
Scheduling accesses around:
Open/closed rows
Precharging
Refreshing
Data/Control bus usage
[DRAMsim, UMD]
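
To show why open and closed rows matter to the scheduler, here is a minimal sketch of per-bank open-row tracking under an open-page policy. It is not DRAMsim and not the controller from the talk: the class name `OpenRowTracker` and the cycle counts are hypothetical, and refresh and bus contention are left out.

```python
from typing import Dict, Optional

# Hypothetical latencies in controller cycles; real values depend on the part.
T_CAS = 5   # column access on an already open row
T_RCD = 5   # activate (open) a row
T_RP = 5    # precharge (close) the currently open row


class OpenRowTracker:
    """Tracks the open row of each bank under an open-page policy."""

    def __init__(self) -> None:
        self.open_row: Dict[int, Optional[int]] = {}

    def access(self, bank: int, row: int) -> int:
        """Return the latency of one access and update the bank's open row."""
        current = self.open_row.get(bank)
        if current == row:
            latency = T_CAS                   # row hit
        elif current is None:
            latency = T_RCD + T_CAS           # idle bank: activate, then access
        else:
            latency = T_RP + T_RCD + T_CAS    # row conflict: precharge first
        self.open_row[bank] = row
        return latency
```

Short, random accesses mostly land in the conflict case, which is the most expensive of the three.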

9 Experimental Setup – Electronic NoC
System: 2cm×2cm chip; 8×8 Electronic Mesh; 28 DRAM Access points (MCs); 2 DIMMs per DRAM AP
Routers: 1 kb input buffers (per VC); 4 virtual channels; 256 b packet size; 128 b channels; 32 nm tech. point (ORION); Normal Vt; Vdd = 1.0 V; Freq = 2.5 GHz
Traffic: Random core-DRAM access point pairs; Random read/write; Uniform message sizes; Poisson arrival at 1µs
DRAM: Modeled cycle-accurately with DRAMsim [Univ. MD]; DDR MT/s; 8 chips per DIMM, 8 banks per chip, 2 ranks
[Figure label: 5-port Electronic Router]
Speaker notes: Introduce: hypothetical CMP in 32nm. Explain we have a simulator.
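
The traffic described above can be sketched as a small generator. This is an illustration, not the authors' simulator, and it rests on several assumptions: "Poisson arrival at 1µs" is read as a mean inter-arrival gap of 1 µs, there is one core per node of the 8×8 mesh, and all requests form a single global stream. The function `generate_traffic` is a hypothetical name.

```python
import random
from typing import List, Tuple

NUM_CORES = 64              # assumption: one core per node of the 8x8 mesh
NUM_ACCESS_POINTS = 28      # DRAM access points (MCs) from the slide
MEAN_INTERARRIVAL_US = 1.0  # assumption: mean gap between requests


def generate_traffic(n: int, seed: int = 0) -> List[Tuple[float, int, int, str]]:
    """Return n requests as (time_us, core, access_point, op) in time order."""
    rng = random.Random(seed)
    now = 0.0
    requests = []
    for _ in range(n):
        # Exponentially distributed gaps yield a Poisson arrival process.
        now += rng.expovariate(1.0 / MEAN_INTERARRIVAL_US)
        core = rng.randrange(NUM_CORES)
        access_point = rng.randrange(NUM_ACCESS_POINTS)
        op = rng.choice(["read", "write"])
        requests.append((now, core, access_point, op))
    return requests
```

Fixing the seed keeps a run reproducible, which helps when the same traffic must be replayed against two network models.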

10 Experiment Results
[Figure: results plot, annotated with 269 Gb/s.]
Speaker notes: Make sure to mention x-axis.

11 Current

12 Goal: Optically Integrated Memory
[Figure labels: Optical Fiber; Optical Transceiver; Vdd, Gnd]

13 Advantages of Photonics
Decoupled energy-distance relationship
No long traces to drive and sync with the clock
DRAM chips can run faster
Less power
Fewer pins on the DIMM module and going into the chip
Eventually required by packaging constraints
Waveguides can achieve dramatically higher density due to WDM
DRAM can be arbitrarily distant: fiber is low loss

14 Hybrid Circuit-Switched Photonic Network
[Figure labels: Broadband 1×2 Switch [Cornell, 2008]; Broadband 2×2 Switch; Transmission [Shacham, NOCS ’07]]

15 Hybrid Circuit-Switched Photonic Network
[Figure labels: Photonic Transmission; Electronic Control; Compute]

16 Hybrid Circuit-Switched Photonic Network
International Symposium on Networks-on-Chip 9/17/2018

17 Hybrid Circuit-Switched Photonic Network
Goal: to establish these circuit paths from anywhere on the network to any DRAM access point [Bergman, HPEC ’07]

18 Photonic DRAM Access
[Figure: the processor / cache connects through the processor gateway to the network. At the chip boundary, a memory gateway (photonic switch, modulators, memory control, network interface to/from the network) connects over fiber / PCB waveguide to the DIMMs. Legend: photonic + electronic; electronic.]
Modulators are needed to send commands to DRAM
The memory control block generates memory control commands
Speaker notes: Practice these.

19 Memory Transaction
1) Read or write request is initiated from a local or remote processor and travels on the electronic network
2) Processor gateway forwards it to the memory gateway
3) Memory gateway receives the request
[Figure: processor / cache, processor gateway, network, chip boundary, memory gateway, and DIMMs, with steps 1 to 3 marked.]

20 Memory READ Transaction
4) MC receives READ command
5) Switch is set up from modulators to DIMM, and from DIMM to network
6) Path setup travels back to the receiving processor. Path ACK returns when the path is set up
7) Row/Col addresses sent to DIMM optically
8) Read data returned optically
9) Path torn down; MC knows how long it will take
Commands can be sent and data returned in the same switch setup
[Figure: memory gateway with photonic switch, modulators, and control, with steps 4 to 8 marked.]

21 Memory WRITE Transaction
4) MC receives WRITE command, which is also a path setup from the processor to the memory gateway
5) Switch is set up from modulators to DIMM
6) Row/Col addresses sent to DIMM
7) Switch is set up from network to DIMM
8) Path ACK sent back to processor
9) Data transmitted optically to DIMM
10) Path torn down from processor after data is transmitted
Controller must alternate DRAM commands with incoming write data
[Figure: memory gateway with photonic switch, modulators, and control, with steps 4 to 9 marked.]
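
The READ and WRITE sequences differ mainly in how often the photonic switch is reconfigured. The sketch below simply encodes the numbered steps from the two transaction slides and counts the switch setups. It is an illustration only: the names `transaction` and `switch_setups` are hypothetical, and no timing is modeled.

```python
from typing import List, Tuple

Event = Tuple[int, str]  # (step number on the slide, description)


def transaction(kind: str) -> List[Event]:
    """Ordered events at the memory gateway for one READ or WRITE."""
    if kind == "read":
        return [
            (4, "MC receives READ command"),
            (5, "switch setup: modulators -> DIMM and DIMM -> network"),
            (6, "path setup to receiving processor, path ACK returns"),
            (7, "row/col addresses sent to DIMM optically"),
            (8, "read data returned optically"),
            (9, "path torn down"),
        ]
    if kind == "write":
        return [
            (4, "MC receives WRITE command (path setup from processor)"),
            (5, "switch setup: modulators -> DIMM"),
            (6, "row/col addresses sent to DIMM"),
            (7, "switch setup: network -> DIMM"),
            (8, "path ACK sent back to processor"),
            (9, "data transmitted optically to DIMM"),
            (10, "path torn down from processor"),
        ]
    raise ValueError(f"unknown transaction kind: {kind!r}")


def switch_setups(events: List[Event]) -> int:
    """Count how many times the photonic switch is reconfigured."""
    return sum(1 for _, text in events if text.startswith("switch setup"))
```

A READ needs one switch setup because commands and data share it, while a WRITE needs two because the controller alternates commands with incoming data.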

22 Optical Circuit Memory (OCM) Anatomy
[Figure: DRAM_OpticalTransceiver with a λ detector bank, a modulator bank, DLL, latches, mux, and drivers. Interfaces: Addr/cntrl (25) and Data (64). Fiber coupling OR waveguide coupling, plus VDD and Gnd.]
[Figure: packet format with fields Cntrl, Data, Bank ID, Burst length, Row address, Col address. Timing diagram over clk and t showing tRCD and tCL.]
All chips accessed in parallel.

23 Advantages of Photonics
Decoupled energy-distance relationship
No long traces to drive and sync with the clock
DRAM chips can run faster
Less power
Fewer pins on the DIMM module and going into the chip
Eventually required by packaging constraints
Waveguides can achieve dramatically higher density due to WDM
DRAM can be arbitrarily distant: fiber is low loss
Simplified memory control logic: no contending accesses, contention handled by path setup
Accesses are optimized for large streams of data

24 Experimental Setup - Photonic
System: 2cm×2cm chip; 8×8 Photonic Torus; 28 DRAM Access points (MCs); 2 DIMMs per DRAM AP
Routers: 256 b buffers; 32 b packet size; 32 b channels; 32 nm tech. point (ORION); High Vt; Vdd = 0.8 V; Freq = 1 GHz
Photonics: 13λ
Traffic: Random core-DRAM access point pairs; Random read/write; Uniform message sizes; Poisson arrival at 1µs
DRAM: Modeled with our event-driven DRAM model; DDR MT/s; 8 chips per DIMM, 8 banks per chip
[Figure label: Photonic Torus Tile]

25 Performance Comparison

26 Experiment #2 Random Statically Mapped Address Space

27 Results

28 Network Energy Comparison
[Figure: network power breakdowns for the Electronic Mesh and the Photonic Torus. Labels: Power = 0.42 W; Power = 13.3 W; Total Power = 2.53 W (including laser power).]
Speaker notes: This is only network power (no DRAM power, MC). Laser power more clear.

29 Summary
Extending a photonic network to include access to DRAM looks good for many reasons:
Circuit-switching allows large burst lengths and simplified memory control, for increased bandwidth
Energy-efficient end-to-end transmission
Alleviates pin count constraints with high-density waveguides
PhotoMAN

