Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Indirect Adaptive Routing on Large Scale Interconnection Networks Nan Jiang, William J. Dally Computer System Laboratory Stanford University John Kim.

Similar presentations


Presentation on theme: "1 Indirect Adaptive Routing on Large Scale Interconnection Networks Nan Jiang, William J. Dally Computer System Laboratory Stanford University John Kim."— Presentation transcript:

1 1 Indirect Adaptive Routing on Large Scale Interconnection Networks Nan Jiang, William J. Dally Computer System Laboratory Stanford University John Kim Korean Advanced Institute of Science and Technology

2 2 Overview Indirect adaptive routing (IAR) –Allow adaptive routing decision to be based on local and remote congestion information Main contributions –Three new IAR algorithms for large scale networks –Steady state and transient performance evaluations –Impact of network configurations –Cost of implementation

3 3 Presentation Outline Background –The dragonfly network –Adaptive routing Indirect adaptive routing algorithms Performance results Implementation considerations

4 4 Group1 0 2 … Global Network … …… Local Network Router1 2 The Dragonfly Network High Radix Network –High radix routers –Small network diameter Each router –Three types of channels –Directly connected to a few other groups Each group –Organized by a local network –Large number of global channels (GC) Large network with a global diameter of one Router0 p0 p1 …

5 5 Routing on the Dragonfly Minimal Routing (MIN) 1.Source local network 2.Global network 3.Destination local network Some Adversarial traffic congests the global channels –Each group i sends all packets to group i+1 Oblivious solution: Valiant’s Algorithm (VAL) –Poor performance on benign traffic Router0 Group1 0 … 2 … p0 p1 … …… Router1 2 Congestion

6 6 Adaptive Routing Choose between the MIN path and a VAL path at the packet source [Singh'05] –Decision metric: path delay –Delay: product of path distance and path queue depth Measuring path queue length is unrealistic Use local queues length to approximate path –Require stiff backpressure Source Router q0q1 q2q3 MIN GC VAL GC Congestion

7 7 Adaptive Routing: Worst Case Traffic 00.10.20.30.40.5 100 150 200 250 300 350 400 450 Throughput (Flit Injection Rate) Packet Latency (Simulation cycles) Valiant’s Minimal Adaptive

8 8 Indirect Adaptive Routing Improve routing decision through remote congestion information Previous method: –Credit round trip [Kim et. al ISCA’08] Three new methods: –Reservation –Piggyback –Progressive

9 9 Credit Round Trip (CRT) Delay the return of local credits to the congested router Creates the illusion of stiffer backpressure Drawbacks –Remote congestion is still inferred through local queues –Information not up to date Source Router Congestion Delayed Credits MIN GC VAL GC [Kim et. al ISCA’08]

10 10 Reservation (RES) Each global channel track the number of incoming MIN packets Injected packets creates a reservation flit Routing decision based on the reservation outcome Drawbacks –Reservation flit flooding –Reservation delay Source Router Congestion RES Flit RES Failed MIN GC VAL GC

11 11 Piggyback (PB) Local congestion broadcast –Piggybacking on each packet –Send on idle channels Congestion data compression Drawbacks –Consumes extra bandwidth –Congestion information not up to date (broadcast delay) Source Router Congestion GC Busy GC Free MIN GC VAL GC

12 12 Progressive (PAR) MIN routing decisions at the source are not final VAL decisions are final Switch to VAL when encountering congestion Draw backs –Need an additional virtual channel to avoid deadlock –Add extra hops Source Router Congestion MIN GC VAL GC

13 13 Experimental Setup Fully connected local and global networks –33 groups –1,056 nodes 10 cycle local channel latency 100 cycle global channel latency 10-flit packets

14 14 Steady State Traffic: Uniform Random 00.10.20.30.40.50.60.70.80.9 100 120 140 160 180 200 220 240 260 280 300 Throughput (Flit Injection Rate) Packet Latency (Simulation cycles) Piggyback Credit Round Trip Progressive Reservation Minimal

15 15 Steady State Traffic: Worst Case 00.10.20.30.40.5 100 150 200 250 300 350 400 450 Throughput (Flit Injection Rate) Packet Latency (Simulation cycles) Piggyback Credit Round Trip Progressive Reservation Valiant’s

16 16 Transient Traffic: Uniform Random to Worst Case 020406080100 200 300 400 500 Cycles After Transition Packet Latency Average Packet Latency per Cycle - UR to WC Progressive Piggyback 020406080100 0 50 100 Cycles After Transition % of Packets Routing Nonminimally % Packets Routing Non-minimally per Cycle - UR to WC Progressive Piggyback

17 17 Network Configuration Considerations Packet size –RES requires long packets to amortize reservation flit cost –Routing decision is done on per packet basis Channel latency –Affects information delay (CRT, PB) –Affects packet delay (PAR, RES) Network size –Affects information bandwidth overhead (RES, PB) Global diameter greater than one –Need to exchange congestion information on the global network

18 18 Cost Considerations Credit round trip –Credit delay tracker for every local channel Reservation –Reservation counter for every global channel –Additional buffering at the injection port to store packets waiting for reservation Piggyback –Global channel lookup table for every router –Increase in packet size Progressive –Extra virtual channel for deadlock avoidance

19 19 Conclusion Three new indirect adaptive routing algorithms for large scale networks Performance and design evaluation of the algorithms Best Algorithm? –Piggyback performed the best under steady state traffic –Progressive responded fastest to transient changes –Network configurations will affect some algorithm performance –Cost of implementation

20 20 Thank You! Questions?

21 21 Adaptive Routing: Uniform Traffic 00.10.20.30.40.50.60.70.80.9 100 120 140 160 180 200 220 240 260 280 300 Throughput - Flit Injection Rate Packet Latency - Simulation cycles VAL MIN Adaptive

22 22 Transient Traffic: Worst Case to Uniform Random

23 23 Transient Traffic: Worst Case 1 to Worst Case 10

24 24 1000 Random Permutation Traffic 200300 0 5 10 15 20 25 30 PB Packet Latency % of 1K Permutations 200300 0 5 10 15 20 25 30 CRT Packet Latency % of 1K Permutations 200300 0 5 10 15 20 25 30 PAR Packet Latency % of 1K Permutations 200300 0 5 10 15 20 25 30 RES Packet Latency % of 1K Permutations 200300 0 5 10 15 20 25 30 VAL Packet Latency % of 1K Permutations

25 25 Effect of Packet size on RES: Worst Case Traffic

26 26 Large local network: Uniform Random 00.10.20.30.40.50.60.70.80.9 0 50 100 150 200 250 300 350 400 Throughput - Flit Injection Rate Packet Latency - Simulation cycles PB CRT MIN PAR RES

27 27 Large local network: Worst Case 00.10.20.30.40.5 0 100 200 300 400 500 600 Throughput - Flit Injection Rate Packet Latency - Simulation cycles PB CRT PAR RES VAL


Download ppt "1 Indirect Adaptive Routing on Large Scale Interconnection Networks Nan Jiang, William J. Dally Computer System Laboratory Stanford University John Kim."

Similar presentations


Ads by Google