Jose Miguel Montanana (NII, Japan) Michihiro Koibuchi (NII, Japan ) Hiroki Matsutani ( U of Tokyo, Japan ) Hideharu Amano ( Keio U/ NII, Japan ) Stabilizing.


1 Stabilizing Path Modification of Power-Aware On/Off Interconnection Networks
Jose Miguel Montanana (NII, Japan), Michihiro Koibuchi (NII, Japan), Hiroki Matsutani (U of Tokyo, Japan), Hideharu Amano (Keio U/NII, Japan)

2 Outline
HPC networks (InfiniBand, GbE)
On/Off link activation method
  – Reducing power consumption of HPC networks
  – Paths are updated to avoid deactivated links
Applying network reconfiguration to switches
Evaluations
  – Cycle-accurate network simulator
  – Behavior of network during the path change

3 Network of High-performance computing
[Figure: Number of Supercomputers on Top500 List; axis: Percentage on Top500 List, 0% to 60%]

4 Examples 2008
Virginia Tech's X: 2,200 cores, 280th on Top500
ABE (NCSA): 9,600 cores, 23rd on Top500
ASCI-Q (LANL): 8,192 cores
BLUEGENE/L (LLNL): 212,992 processors, 2nd on Top500
RoadRunner (LANL): 122,400 cores, 1st on Top500
TACC (Univ Texas): 251,904 cores, 5th on Top500
Interconnect labels shown on the slide: IBA, Proprietary, Quadrics

5 HPC Networks
Small switches (24/48-port) provide the lowest cost per port
When 100,000 cores are connected, a large number of small switches is needed
  – drastically increasing the number of links
  – Unused and rarely-used links should be deactivated for power-aware HPCs
[Figure: switches and hosts 0 to 15 connected by TREE 1 to TREE 4; link aggregation using 3 links; 4 paths]

6 Power cons of HPC switches
Power consumption is almost constant regardless of traffic load
The number of activated ports dominates the power consumption of switches
  – Power consumption of a port is reduced to ZERO by the port-shutdown operation
Unit: W (switch types on the slide: GbE, IB)
Product         Port   Other (Xbar)   Total (ratio of ports)
PC5324          1.2    14.9           42.9 (65%)
PC6224          2.0    42.5           91.1 (53%)
PC6248          2.1    56.8           155.2 (63%)
SF-420          1.0    32.6           55.4 (41%)
SFS7000D-SK9    1.0    43.4           66.1 (34%)
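The "ratio of ports" column can be checked with a short calculation. The sketch below assumes the ratio is (Total minus Other) divided by Total; the slide does not state the formula, so that is an assumption, although under it the rounded results agree with the percentages in the table. The function names are illustrative only.

```python
# Power figures in watts, copied from the table on this slide:
# product -> (power per port, "Other (Xbar)" power, total power).
SWITCHES = {
    "PC5324": (1.2, 14.9, 42.9),
    "PC6224": (2.0, 42.5, 91.1),
    "PC6248": (2.1, 56.8, 155.2),
    "SF-420": (1.0, 32.6, 55.4),
    "SFS7000D-SK9": (1.0, 43.4, 66.1),
}


def port_share(other_w, total_w):
    """Fraction of total power not attributed to 'Other (Xbar)'.

    Assumption: this is how the slide's 'ratio of ports' was derived.
    """
    return (total_w - other_w) / total_w


def shares_percent():
    """Return the port share of every switch as a rounded percentage."""
    return {
        name: round(100 * port_share(other, total))
        for name, (_port, other, total) in SWITCHES.items()
    }


if __name__ == "__main__":
    for name, pct in shares_percent().items():
        print(f"{name}: {pct}% of switch power goes to ports")
```

The share is an upper bound on what port shutdown can save on a given switch, since the "Other (Xbar)" part stays powered.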

7 Outline
HPC networks (InfiniBand, GbE)
On/Off link activation method
  – Reducing power consumption of HPC networks
  – Paths are updated to avoid deactivated links
Applying network reconfiguration to switches
Evaluations
  – Cycle-accurate network simulator
  – Behavior of network during the path change

8 Overview of the on/off link method
Network load is not always high (e.g. during computation time)
Switch ports consume 40-60% of the total power of a switch
[Figure: the tree network of switches and hosts 0 to 15 (TREE 1 to TREE 4), shown before and after the traffic load becomes low and a part of the links is turned off]

9 A runtime on/off link method
Flowchart: Traffic monitoring -> Low or high-load links appear? (No: keep monitoring; Yes: continue) -> Selection of on/off links and paths -> Update of link status and paths
Traffic monitoring, e.g.: port monitor, IPTraf, pilot execution
How is the NW stabilized during the path update? Very crucial factor
[Figure: low traffic load is detected on the tree network (TREE 1 to TREE 4, hosts 0 to 15); paths before and after; the "before" path is deactivated]
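The monitoring loop on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the thresholds `low` and `high`, the function names, and the rule "turn off every lightly loaded link, turn every inactive link back on when any active link is heavily loaded" are all assumptions. A real selection step would also have to keep the network connected and compute the new paths.

```python
def plan_update(load, is_on, low=0.2, high=0.8):
    """Select links to deactivate and reactivate from monitored load.

    load:  link name -> utilization in [0, 1] from traffic monitoring
    is_on: link name -> True if the link is currently active
    low/high are hypothetical thresholds, not values from the slides.
    """
    turn_off = sorted(l for l in load if is_on[l] and load[l] < low)
    overloaded = any(is_on[l] and load[l] > high for l in load)
    turn_on = sorted(l for l in load if not is_on[l]) if overloaded else []
    return turn_off, turn_on


def step(load, is_on, low=0.2, high=0.8):
    """Run one pass of the flowchart; return True if paths need updating."""
    turn_off, turn_on = plan_update(load, is_on, low, high)
    if not turn_off and not turn_on:
        return False  # "No" branch: nothing to change, keep monitoring
    for link in turn_off:
        is_on[link] = False
    for link in turn_on:
        is_on[link] = True
    return True  # "Yes" branch: link status changed, paths must follow


if __name__ == "__main__":
    status = {"a": True, "b": True, "c": False}
    print(step({"a": 0.05, "b": 0.5, "c": 0.0}, status), status)
```

Whenever `step` returns True, the routing tables have to change while traffic is in flight, which is the stabilization problem the rest of the talk addresses.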

10 Stabilizing network during the path update
Network Reconfiguration (deadlock avoidance)
Rold = Routing Table before the update
Rnew = Routing Table after the update
  – Rold is deadlock free
  – Rnew is deadlock free
  – Rold+Rnew may deadlock
[Figure: NW Reconfiguration from Rold to Rnew on a network of switches 0 to 6 and links]

11 Network Reconfiguration
  – Rold is deadlock free
  – Rnew is deadlock free
  – Rold+Rnew may cause deadlock
Reconfiguration Deadlock: old behind new, new behind old
[Figure: Reconfiguration from Rold to Rnew on switches 0 to 6]
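One common way to see why Rold+Rnew may deadlock is a channel dependency graph: a routing function is deadlock free if its dependency graph has no cycle. The sketch below is an illustration under that model, not taken from the slides; the four channels `a` to `d` and both dependency sets are made up. Each set alone is acyclic, but their union closes a ring.

```python
def has_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    state = {}  # node -> "visiting" while on the DFS stack, "done" afterwards

    def visit(node):
        if state.get(node) == "visiting":
            return True  # back edge: we returned to a node on the stack
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)


# Hypothetical channel dependencies on a four-channel ring a -> b -> c -> d -> a.
# Rold never uses the dependency d -> a; Rnew never uses b -> c.
R_OLD = {("a", "b"), ("b", "c"), ("c", "d")}
R_NEW = {("c", "d"), ("d", "a"), ("a", "b")}

if __name__ == "__main__":
    print("Rold cycle:", has_cycle(R_OLD))
    print("Rnew cycle:", has_cycle(R_NEW))
    print("Rold+Rnew cycle:", has_cycle(R_OLD | R_NEW))
```

While packets routed by both tables share the same channels, the union is the graph that matters, which is why "old behind new" and "new behind old" packets can block each other.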

12 Existing NW reconf tech. on fault-tolerant networks
Static reconfiguration: STATIC RECONFIGURATION (ST)
  – Traffic is stopped
  – New routing is applied
  – Traffic is resumed
  – High latencies
Dynamic reconfiguration: DOUBLE-SCHEME, SIMPLE RECONFIGURATION
  – Traffic is not stopped
  – Old and new routing coexist
  – Difficulty to avoid deadlock

13 Current NW Reconfigurations
– SR-PDA: Simple Reconfiguration, Packet Dropping Aware [Lysne08, IEEE TC]
    Tokens are sent before the update of routing
    Packets are sent after updating routing tables
– SR-LA: Simple Reconfiguration, Latency Aware [Lysne08, IEEE TC]
    All new tables are distributed before using the new one
    Latency due to the tokens is reduced
– DS: Double Scheme [Pinkston03, TPDS]
    Requires 2 virtual channels
    One channel has to be drained
– ST: Static Reconfiguration
    Traffic injection is completely stopped
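The two properties listed for Double Scheme (two virtual channels, one of which has to be drained) can be pictured with a toy model. This is only a sketch of that idea and not the algorithm from [Pinkston03, TPDS]: the class `TwoVcPort`, the choice of VC 0 for old-routing packets and VC 1 for new-routing packets, and the one-packet-per-VC forwarding rule are all assumptions made for illustration.

```python
from collections import deque


class TwoVcPort:
    """Toy port with two virtual channels (VCs).

    Assumption: packets routed with the old table stay on VC 0 and
    packets routed with the new table use VC 1, so the two routing
    functions never share a buffer.
    """

    def __init__(self, old_packets):
        self.vc = {0: deque(old_packets), 1: deque()}

    def inject(self, packet):
        """Traffic injected after the update starts goes to VC 1 only."""
        self.vc[1].append(packet)

    def forward_one(self):
        """Forward at most one packet per VC and return what left the port."""
        return [self.vc[v].popleft() for v in (0, 1) if self.vc[v]]

    def old_drained(self):
        """True once no old-routing packet remains on VC 0."""
        return not self.vc[0]


if __name__ == "__main__":
    port = TwoVcPort(["old-1", "old-2"])
    port.inject("new-1")
    while not port.old_drained():
        print(port.forward_one())
```

In this model new traffic keeps flowing while the old channel drains, in contrast to static reconfiguration, where injection is stopped completely.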

14 Outline
HPC Interconnects (InfiniBand, GbE)
On/Off link activation method
  – Reducing power consumption of HPC networks
  – Paths are updated to avoid deactivated links
Applying network reconfiguration to switches
Evaluations
  – Cycle-accurate network simulator
  – Behavior of network during the path change

15 Simulation Environment
Switch model (InfiniBand)
  – Buffered input (1KB per VL) and output (1KB per VL) ports
  – Non-multiplexed crossbar with separate ports per VL
  – FIFO-based crossbar arbiter per output crossbar port
  – Round-robin arbiter per output port
  – 100 ns routing time
Link model
  – Link Speed = 2.5 Gbps (1X links)
Topologies
  – 2D mesh networks
Traffic model
  – Packet lengths are 58 bytes
  – Uniform
  – Full range of traffic, from low load to saturation
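A little arithmetic puts these parameters in proportion. The sketch below assumes the full 2.5 Gbps is available for packet bits (it ignores any line-encoding overhead) and that 1KB means 1024 bytes; neither is stated on the slide, and the function names are illustrative.

```python
PACKET_BYTES = 58      # packet length from the traffic model
LINK_BPS = 2.5e9       # 1X link speed; assumed to be fully usable for data
ROUTING_NS = 100       # routing time per switch
BUFFER_BYTES = 1024    # 1KB per VL, assuming 1KB = 1024 bytes


def serialization_ns():
    """Time to put one packet on the link, in nanoseconds."""
    return PACKET_BYTES * 8 / LINK_BPS * 1e9


def packets_per_buffer():
    """Whole packets that fit in one per-VL buffer."""
    return BUFFER_BYTES // PACKET_BYTES


if __name__ == "__main__":
    print(f"serialization: {serialization_ns():.1f} ns per packet")
    print(f"routing:       {ROUTING_NS} ns per switch")
    print(f"buffer holds:  {packets_per_buffer()} packets per VL")
```

Under these assumptions a packet takes roughly twice as long to serialize as to route, and each per-VL buffer holds well under twenty packets.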

16 Evaluation Results
We apply the NW reconf. process twice in each execution:
  – Deactivating links, after decreasing the traffic injection
  – Re-activating links, after increasing the traffic injection
We evaluated the full range of initial traffic injection (from low traffic to near congestion)

17 Static Reconfiguration (ST)
[Figures: (a) Low Traffic Load, (b) High Traffic Load. Annotations: traffic load decreases and a link is deactivated; traffic load increases and a link is reactivated; latency is high]
At each on/off link operation, traffic is not stabilized in ST!!

18 SR-LA (dynamic reconfiguration)
[Figures: (a) Low Traffic Load, (b) High Traffic Load]
Also, at each on/off link operation, traffic is not stabilized in SR-LA!!

19 SR-PDA (dynamic reconfiguration)
[Figures: (a) Low Traffic Load, (b) High Traffic Load]
Also, at each on/off link operation, traffic is not stabilized in SR-PDA!!

20 Double Scheme (dynamic reconfiguration)
[Figures: (a) Low Traffic Load, (b) High Traffic Load. Annotations: traffic load decreases, latency is constant; traffic load increases, latency is constant]
The path update is stabilized only in Double Scheme!!

21 Larger Network (8x8 Mesh)
[Figures: DS, ST, SRL]
Similar behavior!!
Only Double Scheme stabilizes networks during the path update!!

22 Conclusions
We apply network reconfiguration techniques to power-aware on/off networks for HPC
  – Links consume ~63% of switch power
On/off link activation reduces power
It must accept the topology change
  – Network reconfiguration smoothly supports the path update
    » Stabilizing the update of new/old paths
    » Avoiding deadlocks of new/old paths
Cycle-accurate simulation
  – shows its impact on the power-aware on/off networks
Double Scheme (dynamic NW reconf) maintains performance while stabilizing networks and avoiding deadlocks
Network reconfiguration is essential for realizing the power-aware on/off networks for HPC systems

23 Acknowledgment This work was partially supported by JST CREST (ULP-HPC: Ultra Low-Power, High-Performance Computing via Modelling and Optimization of Next Generation HPC Technologies)
