Published by Leo Owens. Modified about 1 year ago.

1
Weighted Random Oblivious Routing on Torus Networks
Rohit Sunkam Ramanujam, Bill Lin
Electrical and Computer Engineering
University of California, San Diego

2
Networks-On-Chip
Chip-multiprocessors (CMPs) increasingly popular
Torus, Mesh, Flattened Butterfly – candidate architectures for on-chip networks
[Figures: Intel Larrabee, Tilera Tile64]

3
Networks-On-Chip
Chip-multiprocessors (CMPs) increasingly popular
Torus, Mesh, Flattened Butterfly – candidate architectures for on-chip networks
[Figures: Folded Torus, 2D Torus]

4
Routing Algorithm Wishlist
                                       Ideal
Optimum worst-case throughput          ✔
Low latency                            ✔
Good average-case throughput           ✔
Easy to guarantee deadlock freedom     ✔
Low implementation complexity          ✔
Closed-form algorithmic description    ✔

5
Outline
Motivation
Related Work
Optimal routing for rings
Optimal routing for 2D torus

6
Optimal Oblivious Routing
Cast as a multi-commodity flow problem
  – Maximize worst-case throughput
  – Minimize hop count
Solve using linear programming (LP)
Impractical for large networks
  – Number of paths too large (exponential)
  – Hard to make it deadlock-free
  – LP not scalable

7
Optimal Oblivious Routing
                                       Ideal   Optimal Oblivious
Optimum worst-case throughput          ✔       ✔
Low latency                            ✔       ✔
Good average-case throughput           ✔       ✔
Easy to guarantee deadlock freedom     ✔       X
Low implementation complexity          ✔       X
Closed-form algorithmic description    ✔       X

8
Optimal 2TURN
Optimum oblivious routing with only 2TURN paths.
[Figure: 4×4 torus, nodes labeled 0,0 through 3,3]

10
Optimal 2TURN
                                       Ideal   Optimal Oblivious   Optimal 2TURN
Optimum worst-case throughput          ✔       ✔                   ✔
Low latency                            ✔       ✔                   ✔
Good average-case throughput           ✔       ✔                   ✔
Easy to guarantee deadlock freedom     ✔       X                   ✔
Low implementation complexity          ✔       X                   X
Closed-form algorithmic description    ✔       X                   X

11
Valiant Load Balancing (VAL)
2 phases of X-Y routing
[Figure: 4×4 torus, nodes labeled 0,0 through 3,3]

12
Improved Valiant Routing (IVAL)
Phase 1: X-Y, Phase 2: Y-X
[Figure: 4×4 torus, nodes labeled 0,0 through 3,3]
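
The two-phase idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the intermediate node is drawn uniformly at random (the usual Valiant construction), each phase is routed minimally in dimension order, and ties on an even ring go clockwise. The names `ring_steps`, `dor_path` and `two_phase_path` are hypothetical and not from the slides.

```python
import random

def ring_steps(a, b, k):
    """Minimal route from a to b on a k-node ring as a list of +1/-1 steps.
    Assumption: when both directions are equally long, go clockwise (+1)."""
    fwd = (b - a) % k
    if fwd <= k - fwd:
        return [+1] * fwd
    return [-1] * (k - fwd)

def dor_path(src, dst, k, order):
    """Dimension-order route on a k x k torus; order is "xy" or "yx"."""
    x, y = src
    path = [src]
    for dim in order:
        if dim == "x":
            for s in ring_steps(x, dst[0], k):
                x = (x + s) % k
                path.append((x, y))
        else:
            for s in ring_steps(y, dst[1], k):
                y = (y + s) % k
                path.append((x, y))
    return path

def two_phase_path(src, dst, k, rng, phase2_order="yx"):
    """Phase 1: X-Y to a random intermediate node. Phase 2: Y-X to dst (IVAL).
    Passing phase2_order="xy" gives two phases of X-Y routing, as in VAL."""
    mid = (rng.randrange(k), rng.randrange(k))
    first = dor_path(src, mid, k, "xy")
    second = dor_path(mid, dst, k, phase2_order)
    return first + second[1:]  # drop the duplicated intermediate node
```

With `phase2_order="yx"` the hop sequence is X, Y, Y, X, so the two Y segments merge and the path has at most two turns.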

14
VAL and IVAL
                                  Ideal   Optimal Oblivious   Optimal 2TURN   VAL   IVAL
Optimum worst-case throughput     ✔       ✔                   ✔               ✔     ✔
Low latency                       ✔       ✔                   ✔               X     X
Good average-case throughput      ✔       ✔                   ✔               X     ✔
Deadlock freedom                  ✔       X                   ✔               ✔     ✔
Low implementation complexity     ✔       X                   X               ✔     ✔
Closed-form description           ✔       X                   X               ✔     ✔

15
Latency Comparison
[Chart; annotation: 13.5%]

16
Evolution of W2TURN
Step 1. Started with the simple case of 1D rings
  – Developed Weighted Random Direction (WRD)
Step 2. Described 2TURN paths in IVAL in terms of routing on 1D segments (I2TURN)
  – I2TURN has an analytical expression for hop count
Step 3. Combined the intuition gained from WRD, I2TURN and optimal 2TURN
  – Developed Weighted Random 2TURN routing (W2TURN)
  – Analytically showed that the latency of W2TURN is strictly better than that of I2TURN

17
Outline
Motivation
Related Work
Optimal routing for rings
Optimal routing for 2D torus

18
Routing on Rings
Randomized Load Balancing (RLB)
  – Optimal worst-case throughput for rings
Same routing strategy for both odd and even radix networks

19
Some Facts …
Worst-case throughput is determined by the maximum channel load under the most adversarial traffic.
For a torus network with radix k, the maximum channel load for worst-case throughput optimality is
  – k/4 for even k
  – k/4 – 1/(4k) for odd k

20
Rings – The Difference Between Odd and Even
Odd radix, tornado traffic: ∆ = (k-1)/2
RLB: Route minimally with probability (k-∆)/k
Why can't we route minimally more often?
Total channel load = (k-1)/2 * (k+1)/(2k) = k/4 – 1/(4k) = maximum load for worst-case throughput optimality

21
Rings – The Difference Between Odd and Even
Even radix, tornado traffic: ∆ = k/2 – 1
RLB: Route minimally with probability (k-∆)/k. Can we route minimally more often?
Total channel load = (k/2 – 1) * (k+2)/(2k) = k/4 – 1/k < maximum load for worst-case throughput optimality
Route minimally with a probability of (k-∆-1)/(k-2) > (k-∆)/k
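
The channel-load arithmetic on these two slides can be checked by brute force. The sketch below assumes a k-node ring with one directed channel per node in each direction, tornado traffic where every node sends ∆ hops clockwise, and exact rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction

def rlb_min_prob(k, delta):
    """RLB: probability of taking the minimal direction."""
    return Fraction(k - delta, k)

def wrd_even_min_prob(k, delta):
    """Even-radix alternative from the slide: (k - delta - 1) / (k - 2)."""
    return Fraction(k - delta - 1, k - 2)

def tornado_max_load(k, min_prob):
    """Maximum expected load over all directed ring channels under tornado traffic."""
    delta = (k - 1) // 2  # (k-1)/2 for odd k, k/2 - 1 for even k
    p = min_prob(k, delta)
    cw = [Fraction(0)] * k   # cw[i]: channel from node i to node i+1
    ccw = [Fraction(0)] * k  # ccw[i]: channel from node i to node i-1
    for src in range(k):
        for h in range(delta):        # minimal route: delta clockwise hops
            cw[(src + h) % k] += p
        for h in range(k - delta):    # non-minimal route: k - delta hops the other way
            ccw[(src - h) % k] += 1 - p
    return max(cw + ccw)

print(tornado_max_load(7, rlb_min_prob))       # odd radix, RLB
print(tornado_max_load(8, rlb_min_prob))       # even radix, RLB
print(tornado_max_load(8, wrd_even_min_prob))  # even radix, higher minimal probability
```

For k = 8 the RLB load comes out below k/4, while the higher minimal-routing probability brings it up to exactly k/4 without exceeding it.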

22
WRD Algorithm
Odd radix:
  – Route minimally with probability (k-∆)/k
  – Route non-minimally with probability ∆/k
Even radix:
  – Route minimally with probability (k-∆-1)/(k-2) when k > 2 and ∆ > 0
  – Route non-minimally with probability (∆-1)/(k-2) when k > 2 and ∆ > 0
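
A minimal Python sketch of the WRD direction choice follows. The probabilities are taken from the slide; everything else is an assumption made for illustration: ∆ is the minimal hop distance on the ring, ∆ = 0 is treated as "nothing to route", an even radix of 2 or less is rejected, and a tie between directions is resolved clockwise. The names `wrd_min_prob` and `wrd_direction` are hypothetical.

```python
import random
from fractions import Fraction

def wrd_min_prob(k, delta):
    """Probability of routing in the minimal direction under WRD."""
    if delta == 0:
        return Fraction(1)  # assumption: source equals destination
    if k % 2 == 1:
        return Fraction(k - delta, k)
    if k <= 2:
        raise ValueError("the even-radix formula requires k > 2")
    return Fraction(k - delta - 1, k - 2)

def wrd_direction(src, dst, k, rng):
    """Return +1 (clockwise) or -1 (counter-clockwise) for one packet."""
    fwd = (dst - src) % k
    delta = min(fwd, k - fwd)
    minimal = +1 if fwd <= k - fwd else -1  # assumption: ties go clockwise
    p = wrd_min_prob(k, delta)
    return minimal if rng.random() < float(p) else -minimal

rng = random.Random(0)
print(wrd_min_prob(8, 3), wrd_direction(0, 3, 8, rng))
```

On a 4-node ring with ∆ = 1 the minimal probability is 1, so the packet always takes the short way; RLB would send it the long way a quarter of the time.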

23
Latency Evaluation
[Chart; annotation: 25%]

24
WRD=Optimal

25
WRD - Ideal for 1D Rings
                                       Ideal   WRD
Optimum worst-case throughput          ✔       ✔
Low latency                            ✔       ✔
Good average-case throughput           ✔       ✔
Easy to guarantee deadlock freedom     ✔       ✔
Low implementation complexity          ✔       ✔
Closed-form algorithmic description    ✔       ✔

26
Outline
Motivation
Related Work
Optimal routing for rings
Optimal routing for 2D torus

29
I2TURN
Describe 2TURN paths in terms of 1D segments.
2TURN paths: X-Y-X or Y-X-Y
[Figure: 4×4 torus, nodes labeled 0,0 through 3,3; path annotations: 3/4, 1/4]
X-Y-X routing
  X: Select intermediate X position x* uniformly at random; route minimally to x*
  Y: Route using RLB on the Y ring at X = x*
  X: Route minimally to the destination
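
The three X-Y-X steps can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: RLB is taken to route minimally with probability (k-∆)/k as stated on the ring slides, ties between directions go clockwise, and nodes are (x, y) tuples on a k × k torus. All function names are hypothetical.

```python
import random

def minimal_steps(a, b, k):
    """Minimal route on a k-node ring as +1/-1 steps (assumption: ties go clockwise)."""
    fwd = (b - a) % k
    return [+1] * fwd if fwd <= k - fwd else [-1] * (k - fwd)

def rlb_steps(a, b, k, rng):
    """RLB on a ring: minimal with probability (k - delta)/k, else the long way round."""
    fwd = (b - a) % k
    delta = min(fwd, k - fwd)
    steps = minimal_steps(a, b, k)
    if delta > 0 and rng.random() >= (k - delta) / k:
        steps = [-steps[0]] * (k - delta)
    return steps

def i2turn_xyx(src, dst, k, rng):
    """Build one X-Y-X path from src to dst as a list of visited nodes."""
    x, y = src
    path = [src]
    x_star = rng.randrange(k)                 # X: pick intermediate column x*
    for s in minimal_steps(x, x_star, k):     #    and route minimally to it
        x = (x + s) % k
        path.append((x, y))
    for s in rlb_steps(y, dst[1], k, rng):    # Y: RLB on the Y ring at X = x*
        y = (y + s) % k
        path.append((x, y))
    for s in minimal_steps(x, dst[0], k):     # X: route minimally to the destination
        x = (x + s) % k
        path.append((x, y))
    return path

print(i2turn_xyx((0, 0), (2, 1), 4, random.Random(3)))
```

Swapping the roles of the two coordinates gives the Y-X-Y variant.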

30
I2TURN – Main Idea
For XYX routing, load balance across the Y-rings to make traffic along every Y-ring admissible
Use worst-case throughput optimal routing (RLB) on the Y-ring
Can easily derive an analytical expression for average packet latency
Can be proved to be equivalent to IVAL; hence, it is worst-case throughput optimal
Can define YXY routing by swapping dimensions

31
W2TURN – Even Radix
Reduces latency over I2TURN
Use WRD instead of RLB
Interpolate X-Y-X and Y-X-Y 2TURN routing with minimal X-Y and Y-X routing
  – XYX: k/(2(k+1))
  – YXY: k/(2(k+1))
  – XY: 1/(2(k+1))
  – YX: 1/(2(k+1))
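
The four weights can be written down and sampled directly. The sketch assumes the slide's fractions read as k/(2(k+1)) and 1/(2(k+1)), which is the reading under which the four weights sum to 1, and that they are used as selection probabilities for the path class. The function names are hypothetical.

```python
import random
from fractions import Fraction

def w2turn_even_weights(k):
    """Selection weights for the four path classes on an even-radix torus."""
    if k % 2 != 0:
        raise ValueError("these weights are stated for even radix only")
    two_turn = Fraction(k, 2 * (k + 1))
    minimal = Fraction(1, 2 * (k + 1))
    return {"XYX": two_turn, "YXY": two_turn, "XY": minimal, "YX": minimal}

def pick_path_class(k, rng):
    """Draw one path class according to the weights."""
    weights = w2turn_even_weights(k)
    names = list(weights)
    return rng.choices(names, weights=[float(weights[n]) for n in names])[0]

w = w2turn_even_weights(4)
print(w, sum(w.values()))
print(pick_path_class(4, random.Random(0)))
```

As k grows, the weight on the purely minimal X-Y and Y-X classes shrinks toward zero and the two 2TURN classes each approach 1/2.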

34
X-Y-X W2TURN
[Figure: 4×4 torus, nodes labeled 0,0 through 3,3; path annotation: 1]
X-Y-X routing
  X: Select intermediate X position x* uniformly at random; route minimally to x*
  Y: Route using WRD on the Y ring at X = x*
  X: Route minimally to the destination
When the number of hops in both directions is equal, avoid using links used by minimal X-Y or Y-X routing.

35
W2TURN – Odd Radix
W2TURN = Optimal 2TURN for odd radix
More elaborate description but easy to implement
Uses X-Y-X and Y-X-Y 2TURN routing with equal probability
Most of the intuition gained by observing optimal 2TURN paths

36
Latency Evaluation
[Chart; annotation: 13.5%]

37
W2TURN ≈ Optimal-2TURN
W2TURN = Optimal-2TURN for odd radix
W2TURN within 0.72% of Optimal-2TURN for even radix

38
Back to our Wishlist …
                                       Ideal   Optimal Oblivious   Optimal 2TURN   VAL   IVAL   W2TURN
Optimum worst-case throughput          ✔       ✔                   ✔               ✔     ✔      ✔
Low latency                            ✔       ✔                   ✔               X     X      ✔
Good average-case throughput           ✔       ✔                   ✔               X     ✔      ✔
Easy to guarantee deadlock freedom     ✔       X                   ✔               ✔     ✔      ✔
Low implementation complexity          ✔       X                   X               ✔     ✔      ✔
Closed-form algorithmic description    ✔       X                   X               ✔     ✔      ✔

39
Summary of Contributions
WRD: Optimal routing algorithm for rings
  – Worst-case throughput optimal
  – Minimum hop count
W2TURN-Odd: Optimal 2TURN routing with a closed-form description for 2D torus with odd radix
W2TURN-Even: Latency within 0.72% of optimal 2TURN routing for 2D torus with even radix
WRD and W2TURN are the best-performing closed-form algorithms for 1D and 2D torus!!

40
Thank You !!

41
Average case throughput

42
Proof of worst-case throughput optimality
Optimal worst-case channel load = 2 * (channel load for uniform traffic)
To prove that a routing algorithm is worst-case throughput optimal, it is sufficient to prove that the maximum channel load is:
  – k/4 when k is even
  – k/4 – 1/(4k) when k is odd
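
The factor-of-two relation can be checked numerically on a single ring. The sketch below is a sanity check under stated assumptions, not the proof itself: each node sends 1/k of its traffic to every node (itself included), routing is minimal, and when both directions are equally long the traffic is split evenly. The function names are hypothetical.

```python
from fractions import Fraction

def uniform_ring_load(k):
    """Maximum directed-channel load on a k-node ring under uniform traffic."""
    cw = [Fraction(0)] * k   # cw[i]: channel from node i to node i+1
    ccw = [Fraction(0)] * k  # ccw[i]: channel from node i to node i-1
    rate = Fraction(1, k)
    for src in range(k):
        for dst in range(k):
            fwd = (dst - src) % k
            back = k - fwd
            if fwd == 0:
                continue  # traffic to self uses no channel
            if fwd < back:
                share_cw = Fraction(1)
            elif fwd > back:
                share_cw = Fraction(0)
            else:
                share_cw = Fraction(1, 2)  # equidistant: split evenly
            for h in range(fwd):
                cw[(src + h) % k] += rate * share_cw
            for h in range(back):
                ccw[(src - h) % k] += rate * (1 - share_cw)
    return max(cw + ccw)

def worst_case_bound(k):
    """k/4 for even k, k/4 - 1/(4k) for odd k."""
    return Fraction(k, 4) if k % 2 == 0 else Fraction(k, 4) - Fraction(1, 4 * k)

for k in range(3, 10):
    print(k, 2 * uniform_ring_load(k), worst_case_bound(k))
```

For every radix printed, twice the uniform-traffic load matches the bound stated on the slide.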
