Dynamic Topology Optimization for Supercomputer Interconnection Networks

Definitions for Packet Switches:

Layer-1 (L1) switch
–Dumb switch, electronic “patch panel”
–Establishes hard links (real circuits) between endpoints
–Does not read, respect, or otherwise understand packet boundaries or any content it transmits
–Less expensive per port than a Layer-2 switch due to lower complexity
–No header parsing means minimal latency (end-to-end propagation delay)

Layer-2 (L2) switch
–Packet switch that must parse packet headers to determine which output port to route each input packet to; requires line-rate packet decisions
–Capable of multiplexing/demultiplexing messages encoded as streams of packets (Layer-1 cannot do this at line-rate packet granularity)
–Complexity increases costs
–Delays packets due to buffering, packet header parsing, and routing decisions
Hybrid Interconnect

Lower Cost (L1 is less expensive; L2 costs are lower with low port counts)
Lower Latency (L1 contributes little latency)
Improved Fault Tolerance / Fault Recovery
Optimal Scheduling (eliminate job fragmentation of SW topology)
Better Shelf Life (L1 switches are usable for several generations of L2 switch technology)

[Figure: nodes P1-P4 attached through a Layer-1 crossbar (electronic patch panel) to Layer-2 switch blocks SW1-SW4]

The L1 crossbar connects NICs on nodes (P1-Pn) to Layer-2 switch ports (SW1-SWm). The L1 crossbar also connects Layer-2 switch ports (SW1-SWm) to each other to form custom topological neighborhoods. It dynamically provisions custom interconnect topologies on a per-job basis.
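The per-job provisioning idea can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `L1Crossbar`, the function `provision_job`, and port names such as "SW1.0" are hypothetical and do not come from the slides or from any real switch API. The crossbar is modeled as a set of point-to-point circuits, and a job is provisioned by patching each node NIC to a free L2 switch port.

```python
class L1Crossbar:
    """Toy Layer-1 'patch panel': a set of point-to-point circuits (hypothetical model)."""

    def __init__(self):
        # Maps each connected endpoint to its peer; every circuit is stored in both directions.
        self.circuits = {}

    def connect(self, a, b):
        """Establish a hard circuit between endpoints a and b."""
        if a == b:
            raise ValueError("cannot connect an endpoint to itself")
        if a in self.circuits or b in self.circuits:
            raise ValueError("endpoint already in use")
        self.circuits[a] = b
        self.circuits[b] = a

    def disconnect(self, a):
        """Tear down the circuit that endpoint a belongs to."""
        b = self.circuits.pop(a)
        del self.circuits[b]


def provision_job(xbar, nodes, switch_ports):
    """Patch each node NIC to one free L2 switch port and return the assignment."""
    if len(nodes) > len(switch_ports):
        raise ValueError("not enough L2 switch ports for this job")
    assignment = {}
    for node, port in zip(nodes, switch_ports):
        xbar.connect(node, port)
        assignment[node] = port
    return assignment


xbar = L1Crossbar()
job = provision_job(xbar, ["P1", "P2"], ["SW1.0", "SW1.1"])
# Link two L2 switch blocks to each other through the same crossbar.
xbar.connect("SW1.2", "SW2.0")
```

Because nodes and switch ports are only ever joined through the crossbar, releasing a job amounts to calling `disconnect` on its endpoints, after which the same L2 ports can be patched into a different topology for the next job.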
Hybrid Interconnect (details)

Lower Cost
–L1 switches cost a fraction of L2 per port
–Design allows bias towards cheaper, low-port-count L2 switches
–L2 switch infrastructure costs can scale linearly with system size (e.g. provision optimal mesh topology for each application)

Lower Latency
–L1 switches form hard circuits, contributing virtually no latency to the switch fabric (only propagation delay due to the speed of light); L2 stage delays are larger due to packet header parsing
–L1 light path for MEMS-based optical switches requires no OEO conversion! (getting 90% of the way to an all-optical switching fabric!)
–Goal: single L2 switch hop for any p2p message

Fault Tolerance
–Lock out failing L2 switch blocks (in a torus or mesh, failures induce a significant runtime performance hit)

Optimal Scheduling (eliminate job fragmentation)
–Prevents fragmentation of the network topology when jobs are scheduled that do not fit into available dense “slots” in the network topology (as much a problem for fat-tree as it is for mesh/torus… ORNL Altix example)
–Optimal mesh topology can be provisioned for each job regardless of current system state

Better Shelf Life
–Same L1 crossbar switch can be used for multiple generations of L2 switch implementations
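The latency argument can be made concrete with a simple additive model: the L1 stage contributes only propagation delay, while each L2 hop adds a fixed switching delay. This is a rough sketch, not a measured result. The function name `path_latency_ns` is hypothetical, the 100 ns per-hop figure in the usage lines is a placeholder rather than data from the slides, and 5 ns per metre is the usual approximation for light in optical fiber.

```python
# Approximate propagation delay of light in optical fiber (about 5 ns per metre).
FIBER_NS_PER_M = 5.0


def path_latency_ns(l2_hops, l2_hop_delay_ns, path_length_m, ns_per_m=FIBER_NS_PER_M):
    """Estimate one-way latency: propagation delay plus a fixed delay per L2 hop.

    L1 circuits are assumed to add nothing beyond propagation delay.
    """
    if l2_hops < 0 or l2_hop_delay_ns < 0 or path_length_m < 0:
        raise ValueError("arguments must be non-negative")
    propagation = path_length_m * ns_per_m
    switching = l2_hops * l2_hop_delay_ns
    return propagation + switching


# Placeholder numbers: 10 m of fiber, 100 ns per L2 hop (assumed, not measured).
single_hop = path_latency_ns(l2_hops=1, l2_hop_delay_ns=100.0, path_length_m=10.0)
three_hops = path_latency_ns(l2_hops=3, l2_hop_delay_ns=100.0, path_length_m=10.0)
```

Under this model, the stated goal of a single L2 hop for any point-to-point message removes the switching term for every additional hop, leaving propagation delay as the only cost that grows with path length.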
Investigation Plan

Analyze communication topology requirements of existing DOE applications
–Collaboration with Jeff Vetter (ORNL) to capture communication requirements

Use captured communication requirements to define the proper balance between L1 and L2 switch layers

Use a cost model for existing L1 and L2 switch components to predict cost benefits for the hybrid infrastructure

Develop communication performance models for specific codes to predict benefits of lower-latency interconnects
–Collaboration with UIC/StarLight facility to test using their Glimmerglass optical crossbar switches
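The slides do not give the cost model itself, so the following is only one plausible shape for it: a linear per-port model in which fabric cost is the sum of L1 and L2 port counts weighted by per-port prices. The function name `fabric_cost` and all numbers in the usage lines are hypothetical placeholders, not real component prices.

```python
def fabric_cost(l1_ports, l2_ports, l1_cost_per_port, l2_cost_per_port):
    """Linear per-port cost of a fabric built from L1 and L2 switch ports (assumed model)."""
    if min(l1_ports, l2_ports, l1_cost_per_port, l2_cost_per_port) < 0:
        raise ValueError("arguments must be non-negative")
    return l1_ports * l1_cost_per_port + l2_ports * l2_cost_per_port


# Placeholder comparison: an L1 port is assumed to cost a quarter of an L2 port.
hybrid = fabric_cost(l1_ports=8, l2_ports=4, l1_cost_per_port=1.0, l2_cost_per_port=4.0)
pure_l2 = fabric_cost(l1_ports=0, l2_ports=12, l1_cost_per_port=1.0, l2_cost_per_port=4.0)
```

The port counts that feed such a model would come from the captured application communication requirements, which determine how many L2 ports each job actually needs.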