PRISM: High-Capacity Networks that Augment Campus’ General Utility Production Infrastructure Philip Papadopoulos, PhD. Calit2 and SDSC
Some Perspective on 100Gbps
DDR3 1600MHz memory DIMM = 12.8GB/s (102.4Gbps)
– Triton compute nodes (24GB/node): enough memory capacity to source 100Gbps for ~2 seconds
High-performance flash: 500MB/sec, about 24 flash drives to fill 100Gbps
– 250GB each (6TB total): ~8 minutes at 100Gbps
Data Oasis high-performance parallel file system at SDSC (all 10GbE)
– 64 servers, 72TB each, 2GB/sec disk-to-network
– 4.6PB total: ~102 hours at 100Gbps
100Gbps is really big from some perspectives, not so big from others.
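As a quick sanity check, the drain times on this slide follow from a single formula: capacity in bits divided by the link rate. The sketch below uses decimal units (GB = 10^9 bytes, etc.) and reproduces the three figures quoted above.

```python
# Back-of-envelope check of the "perspective on 100 Gbps" numbers.
# Figures are taken from the slide; GB/TB are decimal (10^9 / 10^12 bytes).

LINK_GBPS = 100  # target link rate, gigabits per second

def drain_seconds(capacity_bytes):
    """Seconds needed to push this many bytes through a 100 Gbps link."""
    return capacity_bytes * 8 / (LINK_GBPS * 1e9)

# Triton compute node: 24 GB of memory
print(drain_seconds(24e9))               # ~1.9 s, i.e. "about 2 seconds"

# 24 flash drives, 250 GB each = 6 TB total
print(drain_seconds(6e12) / 60)          # 8.0 minutes

# Data Oasis: 64 servers x 72 TB = ~4.6 PB
print(drain_seconds(64 * 72e12) / 3600)  # ~102 hours
```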
Terminating 100Gbps
Once you land 100Gbps at your campus, where does it go from there? What kinds of devices need to be connected?
Some History at UCSD: A Decade of Leading-Edge Research Networks
ITR: The OptIPuter, $15M
– Smarr, PI; Papadopoulos, Ellisman, UCSD Co-PIs; DeFanti, Leigh, UIC Co-PIs
– "If the network ceases to become a bottleneck, how does that change the design of distributed programs?"
2004, MRI: Development of Quartzite, a Campus-wide, Terabit-Class, Field-Programmable, Hybrid Switching Instrument for Comparative Studies, $1.48M
– Papadopoulos, PI; Smarr, Fainman, Ford, Co-PIs
– "Make the network real for OptIPuter experiments"
[Slide: campus map of the OptIPuter Network (2005). Dedicated fibers (site spacing roughly ½ mile) link Linux clusters at SIO, SDSC, the SDSC Annex, CRCA, Phys. Sci-Keck, SOM (Medicine), JSOE (Engineering), Preuss High School, 6th College, and Earth Sciences, with a collocation point to CENIC and NLR. Core switches: a Juniper T320 and a Chiaro Estara (6.4 Tbps backplane bandwidth, 20X the T320). Source: Phil Papadopoulos, SDSC; Greg Hidley, Calit2]
Technology Motion
Chiaro (out of business)
– Replaced capability with a Force10 E1200
– Moved the physical center of the network to Atkinson Hall (Calit2)
Juniper T320 (retired)
– Upgraded by Campus/SDSC with a pair of MX960s
Endpoints replaced/upgraded over time at all sites
Quartzite introduced DWDM, all-optical, and wavelength switching
What was constant?
– The fiber plant (how we utilized it moved over time)
What was growing?
– Bigger data at an increasing number of labs. Instrument capacity.
Next Generation (NSF Award# OCI )
NSF Campus Cyberinfrastructure Program (CC-NIE), $500K, 1/1/2013 start date; Papadopoulos, PI; Smarr, Co-PI
Replace the Quartzite core
– Packet switch only (hybrid not required)
– 10GbE, 40GbE, 100GbE capability
– "Small" switch: 11.5Tbit/s full-bisection, 1+Tbit/sec terminated in phase 0
Expansion to more sites on and off campus
Widen the freeway between SDSC and Calit2
– Access to SDSC/XSEDE resources
– Campus has committed to a 100Gb/s Internet2 connection; Prism is the natural termination network.
Expanding Network Reach for Big Data Users Phil Papadopoulos, SDSC, Calit2, PI
Prism Core Switch – Arista Networks
Next-generation 7504: what 11.5Tb/s looks like (under 3kW)
This is the Prism core switch (delivery in March 2013). It will have 10GbE (48 ports), 40GbE (36 ports), and 100GbE short-reach (2 ports), with 2 slots left empty for expansion.
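A quick check that the phase-0 port loadout quoted above is consistent with the "1+Tbit/sec terminated" claim on the previous slide (the 11.5Tb/s figure is the chassis fabric capacity, not terminated bandwidth):

```python
# Phase-0 port loadout of the Arista 7504, from the slide:
# 48 x 10GbE, 36 x 40GbE, 2 x 100GbE short-reach.
ports = {10: 48, 40: 36, 100: 2}  # port rate (Gb/s) -> port count

terminated_gbps = sum(rate * count for rate, count in ports.items())
print(terminated_gbps)  # 2120 Gb/s terminated, i.e. "1+ Tbit/sec"
```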
Physical Connections
A variety of transceiver technologies:
– Copper 10Gbit and 40Gbit for inside the machine room
– SR and LR SFP+ 10GbE for in-building and cross-campus runs
– 10GbE DWDM (40km) + passive multiplexers, for fiber conservation and re-use of optics from Quartzite; requires media conversion (DWDM XFPs); VERY reliable: no multiplexer failures and 1 transceiver failure in 5+ years
– 10GbE CWDM + passive multiplexers, in SFP+ form factors (direct plug into the 7504)
– 40GbE LR4, QSFP+ (internally CWDM)
The choice of transceiver depends on where we are going, how much bandwidth is needed, and the connection point
– E.g., Calit2 – SDSC: 12 x 10GbE (2 x LR + 10 DWDM) over 2 fiber pairs. The SDSC landing is 10GbE only (today).
What is our Rationale for Prism?
Big-data labs have particular burst bandwidth needs
– At UCSD, the number of such labs today is roughly
The campus backbone is 10GbE/20GbE and serves 50,000 users on a daily basis with ~80K IP addresses
– One burst data transfer on Prism would saturate the campus backbone
– Protect the campus network from big-data freeway users
– Provide massive network capability in a cost-effective manner
Software-defined networking (SDN) is an emerging technology to better handle configuration
– SDN via OpenFlow will be supported on Prism
– Combines the ability to experiment with a reduced risk of complete network disruption
Easily bridge to identified networks
– Prism ↔ UCSD production network (a 20GbE bridge, equal to the campus backbone itself)
– Prism ↔ XSEDE resources (direct connect to the SDSC 7508s)
– Prism ↔ off-campus high-capacity networks (e.g., ESnet, 100GbE Internet2, NLR)
– Prism ↔ the biotech mesa surrounding UCSD
Really Pushing Data from Storage (what 800+ Gbps looks like)
Saturation test: IOR testing through Lustre reached 835 Gb/s = 104GB/sec (485Gb/s + 350Gb/s+ across the MLAG links, Jun 2012)
OASIS was designed NOT to be an island; this is why we chose 10GbE instead of IB
Papadopoulos set a performance target of 100+GB/sec for the Gordon Track 2 proposal (submitted in 2010); most people at SDSC thought it was "crazy"
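The unit conversion behind the saturation result, plus the theoretical ceiling implied by the Data Oasis figures on the earlier slide (64 servers at ~2GB/sec disk-to-network each), can be checked in a few lines:

```python
# Converting the Lustre/IOR saturation result quoted above.
ior_gbps = 835                   # measured aggregate rate, Gb/s
print(ior_gbps / 8)              # ~104 GB/s, matching the slide

# Data Oasis ceiling from the earlier slide: 64 servers x ~2 GB/s each
servers, per_server_gbs = 64, 2
ceiling = servers * per_server_gbs
print(ceiling)                   # 128 GB/s theoretical disk-to-network
print(ior_gbps / 8 / ceiling)    # measured run hit ~80% of that ceiling
```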
Summary
Big data + high-capacity inexpensive switching + high-throughput instruments + significant computing and data-analysis capacity all form a "perfect storm"
– The OptIPuter predicted this in 2002; Quartzite amplified that prediction in 2004. We are now here.
You have to work on multiple ends of the problem
– Devices, networks, cost$
Key insight: recognize the fundamental differences between scaling challenges (e.g., the campus's 50K users vs. Prism's 500 users, the 1%)
Build for burst capacity