Distributed Scheduling Algorithms for Switching Systems. Shunyuan Ye, Yanming Shen, Shivendra Panwar. 2015-7-16


1 Distributed Scheduling Algorithms for Switching Systems Shunyuan Ye, Yanming Shen, Shivendra Panwar 2015-7-16

2 Overview Background – Problem definition, related work A randomized scheduling algorithm – Algorithm, example, proof sketch Applications – Buffered crossbar switch: DISQUO – Optoelectronic switch: HELIOS

3 Scheduling Problem Objective: Find a scheduling algorithm that can sustain 100% capacity. [Figure: inputs with VOQs connected through a switching fabric to the outputs]

4 Related Work (1) Maximum Weight Matching (MWM, Tassiulas ’92) – Centralized – $O(N^3)$ computations [Figure: a 3×3 bipartite graph between inputs and outputs with edge weights 10, 15, 5, 10, 2, 6, 3, 8, 12, and the resulting matching with edge weights 15, 10, 12]
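To make the MWM idea concrete, here is a minimal brute-force sketch. It is not the $O(N^3)$ algorithm the slide refers to: it simply enumerates all N! full matchings, which is only practical for tiny switches. Reading the figure's nine edge weights row by row as a 3×3 queue matrix is an assumption, as are the function name and the zero-based indexing.

```python
from itertools import permutations

def max_weight_matching(weights):
    """Brute-force maximum weight matching for a square weight matrix.

    weights[i][j] is the weight (e.g. queue length) of VOQ (i, j).
    Returns (best_weight, list of (input, output) pairs).
    """
    n = len(weights)
    best_weight, best_perm = -1, None
    # Each permutation assigns input i to output perm[i].
    for perm in permutations(range(n)):
        total = sum(weights[i][perm[i]] for i in range(n))
        if total > best_weight:
            best_weight, best_perm = total, perm
    return best_weight, [(i, best_perm[i]) for i in range(n)]

# Assumed row-by-row reading of the weights shown in the figure.
Q = [[10, 15, 5],
     [10, 2, 6],
     [3, 8, 12]]
weight, matching = max_weight_matching(Q)
print(weight, matching)
```

Under that reading, the heaviest matching uses the edges with weights 15, 10, and 12, which agrees with the matching shown on the slide.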

5 Related Work (2) Randomized Scheduling Algorithm (Tassiulas ’98) – Centralized – $O(N)$ computations – Poor delay performance [Figure: a 3×3 bipartite graph between inputs and outputs with edge weights 6, 5, 10, 12, 8, 4, and the selected schedule with edge weights 12, 8, 4]
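The randomized algorithm can be sketched as follows: in each slot, draw a random full matching, compare its weight with that of the previous schedule, and keep the heavier one. This is a minimal sketch under assumptions: a schedule is stored as a permutation (`schedule[i]` is the output matched to input `i`), and the function names and queue values are hypothetical.

```python
import random

def schedule_weight(schedule, Q):
    """Total queue length served by a schedule given as a permutation."""
    return sum(Q[i][j] for i, j in enumerate(schedule))

def randomized_step(prev, Q, rng):
    """One slot: keep the heavier of the previous schedule and a random matching."""
    candidate = list(range(len(Q)))
    rng.shuffle(candidate)  # uniformly random full matching, O(N) work
    if schedule_weight(candidate, Q) > schedule_weight(prev, Q):
        return candidate
    return prev

rng = random.Random(0)
Q = [[6, 5, 10], [12, 8, 4], [3, 7, 9]]  # hypothetical queue lengths
prev = [0, 1, 2]
new = randomized_step(prev, Q, rng)
```

Because the schedule only changes when a randomly drawn matching happens to be heavier, it can take many slots to track the queues, which is consistent with the poor delay performance noted on the slide.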

6 Related Work (3) iSLIP (McKeown, ’98) – Distributed, but cannot guarantee 100% throughput LAURA (Giaccone et al., ’02) – Merges R(n) and S(n-1) – Complexity is $O(N \log N)$ EMHW (Li et al., ’04) – Uses exhaustive service matching; complexity is $O(\log N)$ Glauber dynamics: work of Walrand et al., Srikant et al., Shah

7 Question? Can we have a scheduling algorithm which satisfies all the conditions: – Guaranteed 100% throughput – Low computation complexity, i.e., O(1) – Easy to implement in a distributed way

8 Randomized Scheduling Algorithm Notation – Neighbors: $N(i, j) = \{(i, j') : j' \neq j\} \cup \{(i', j) : i' \neq i\}$, i.e., the VOQs that share input i or output j with (i, j) – Feasible schedule: if $S_{ij}(n) = 1$, then $S_{kl}(n) = 0$ for every (k, l) in N(i, j)
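The notation above can be sketched in a few lines. The zero-based indexing, the 0/1 matrix representation of a schedule, and the function names are assumptions made for illustration.

```python
def neighbors(i, j, n):
    """VOQs that share input i or output j with (i, j), excluding (i, j) itself."""
    same_input = {(i, jj) for jj in range(n) if jj != j}
    same_output = {(ii, j) for ii in range(n) if ii != i}
    return same_input | same_output

def is_feasible(S):
    """S is an n x n 0/1 matrix; feasible if no active VOQ has an active neighbor."""
    n = len(S)
    return all(
        S[k][l] == 0
        for i in range(n)
        for j in range(n)
        if S[i][j] == 1
        for (k, l) in neighbors(i, j, n)
    )

print(sorted(neighbors(0, 0, 3)))
print(is_feasible([[1, 0, 0], [0, 1, 0], [0, 0, 0]]))
print(is_feasible([[1, 1, 0], [0, 0, 0], [0, 0, 0]]))
```

Note that a feasible schedule in this sense does not have to be a full matching: some inputs and outputs may be idle.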

9 Randomized Scheduling Algorithm S(n-1) is the schedule at time n-1. Randomly generate a feasible schedule H(n): – Pre-determined – Hamiltonian walk: it can be implemented in a distributed manner with a time complexity of $O(1)$ [Figure: S(n-1) and H(n)]

10 Randomized Scheduling Algorithm S(n) is generated from S(n-1) and H(n) by the following rules:
a) For (i, j) not in H(n): $S_{ij}(n) = S_{ij}(n-1)$ (the state stays the same).
b) For (i, j) in H(n):
– If (i, j) was in S(n-1): $S_{ij}(n) = 1$ with probability $p_{ij}$ and $S_{ij}(n) = 0$ with probability $1 - p_{ij}$, where $p_{ij}$ is a concave function of $Q_{ij}$.
– If (i, j) was not in S(n-1) and every (k, l) in N(i, j) was inactive in S(n-1): $S_{ij}(n) = 1$ with probability $p_{ij}$ and $S_{ij}(n) = 0$ with probability $1 - p_{ij}$.
– Otherwise: $S_{ij}(n) = 0$.
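The update rules can be sketched as a single function. This is an illustration under several assumptions: schedules are sets of (input, output) pairs with zero-based indices, the activation probability uses the log-based formula from the later example slide with natural logarithms, queues shorter than 1 are given probability 0, and H(n) is a simple cyclic-shift matching used as a stand-in for the Hamiltonian walk (the slides do not spell out that construction).

```python
import math
import random

def activation_prob(q):
    """p = log(q) / (1 + log(q)); 0 for q < 1 (assumed guard for short queues)."""
    if q < 1:
        return 0.0
    lq = math.log(q)
    return lq / (1.0 + lq)

def update_schedule(S_prev, H, Q, rng):
    """Compute S(n) from S(n-1), the candidate schedule H(n), and queue lengths Q."""
    # Rule (a): VOQs outside H(n) keep their previous state.
    S_new = {voq for voq in S_prev if voq not in H}
    # Rule (b): VOQs in H(n) may change state.
    for (i, j) in H:
        if (i, j) in S_prev:
            eligible = True
        else:
            # Eligible only if no VOQ sharing input i or output j was active.
            eligible = not any(k == i or l == j for (k, l) in S_prev)
        if eligible and rng.random() < activation_prob(Q[i][j]):
            S_new.add((i, j))
    return S_new

rng = random.Random(1)
Q = [[1, 10, 3], [8, 2, 5], [4, 7, 9]]  # hypothetical queue lengths
S = set()
for n in range(50):
    H = {(i, (i + n) % 3) for i in range(3)}  # assumed stand-in for H(n)
    S = update_schedule(S, H, Q, rng)
```

Because H(n) is itself feasible, no two VOQs in H(n) are neighbors, so each decision depends only on S(n-1) and the result stays feasible without any coordination between the VOQs being updated.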

11 Randomized Scheduling Algorithm Example [Figure: S(n), H(n+1), and the resulting S(n+1)] For (1, 3): none of its neighbors was active in S(n), so $S_{13}(n+1) = 1$ with probability $p_{13}$ and $S_{13}(n+1) = 0$ with probability $1 - p_{13}$; in the example, $S_{13}(n+1) = 1$. For (2, 1): it was in S(n), so $S_{21}(n+1) = 1$ with probability $p_{21}$ and $S_{21}(n+1) = 0$ with probability $1 - p_{21}$; in the example, $S_{21}(n+1) = 1$. For (3, 2): the same case as (1, 3); in the example, $S_{32}(n+1) = 0$.

12 Intuitive Explanation When (i, j) is picked by H(n) and none of its neighbors was active in the previous slot, (i, j) becomes active with probability $p_{ij}$. If (i, j) becomes active, all of its neighbors are blocked from becoming active. If we define $p_{ij}$ as a concave function of $Q_{ij}$, longer queues have a higher probability of becoming active (and a lower probability of being blocked by short queues). The weight of the active VOQs will be very close to the maximum after the system converges.

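13 Intuitive Explanation Example Queue lengths: $Q_{11} = 1$, $Q_{12} = 10$, $Q_{21} = 8$, $Q_{22} = 2$. Activation probability: $p_{ij} = \log(Q_{ij}) / [1 + \log(Q_{ij})]$ (natural logarithm). $S_{11} = 1$ with probability $p_{11} = 0$; $S_{22} = 1$ with probability $p_{22} \approx 0.4$; $S_{12} = 1$ with probability $p_{12} \approx 0.7$; $S_{21} = 1$ with probability $p_{21} \approx 0.68$. The schedule {(1, 2), (2, 1)} therefore has the higher probability.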
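The activation probabilities in this example can be checked numerically. The sketch below assumes the logarithm is the natural logarithm (the slide does not state the base); under that assumption $Q_{22} = 2$ gives about 0.41, $Q_{12} = 10$ gives about 0.70, and $Q_{21} = 8$ gives about 0.68.

```python
import math

def activation_prob(q):
    """p = ln(q) / (1 + ln(q)) for q >= 1 (natural log is an assumption)."""
    lq = math.log(q)
    return lq / (1.0 + lq)

# Queue lengths from the example, keyed by (input, output).
Q = {(1, 1): 1, (1, 2): 10, (2, 1): 8, (2, 2): 2}
p = {voq: activation_prob(q) for voq, q in Q.items()}

for voq in sorted(p):
    print(voq, round(p[voq], 2))
```

The two long queues, (1, 2) and (2, 1), do not conflict with each other, so both can be active at once, while the short queues (1, 1) and (2, 2) rarely or never activate and so seldom block them.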

14 System Stability Sketch of the proof of system stability – Define the state of the system as the schedule S(n). – The sequence ..., S(n-1), S(n), S(n+1), ... is a Markov chain, and it is time reversible, which implies a product-form stationary distribution. – For any admissible Bernoulli arrival traffic, the weight of S(n) is always close to the maximum weight, that of S*(n), after the system converges. – The system can then be proved to be stable.
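As a hedged illustration of what "product form" means here: assuming the queue lengths (and hence the $p_{ij}$) are held fixed and the chain behaves like standard Glauber dynamics over the set $\mathcal{F}$ of feasible schedules, the stationary distribution would take the following shape. The slides do not state this formula; $\mathcal{F}$ and $Z$ are notation introduced for this sketch.

```latex
\pi(S) = \frac{1}{Z} \prod_{(i,j) \in S} \frac{p_{ij}}{1 - p_{ij}},
\qquad
Z = \sum_{S' \in \mathcal{F}} \prod_{(i,j) \in S'} \frac{p_{ij}}{1 - p_{ij}} .
```

With $p_{ij} = \log(Q_{ij}) / [1 + \log(Q_{ij})]$, each factor simplifies to $p_{ij} / (1 - p_{ij}) = \log Q_{ij}$, so under these assumptions schedules that activate long queues carry most of the probability mass.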

15 DISQUO Scheduling Algorithm DISQUO is a distributed implementation for a buffered crossbar switch. Advantages: – Totally distributed without message passing – Delay performance is very good Drawback: – $N^2$ crosspoint buffers are needed

16 Buffered Crossbar Switch The input schedulers and output schedulers can be independent, and thus distributed. [Figure: N×N buffered crossbar with inputs 1 to N, outputs 1 to N, a VOQ $VOQ_{ij}$ at each input, and a crosspoint buffer $CB_{ij}$ at each crosspoint]

17 DISQUO Scheduling Algorithm Distributed Implementation Example [Figure panels: $n = m^{+}$ and $n = m^{-}$] If crosspoint (i, j) is active, input i and output j have to serve this crosspoint buffer. Otherwise, they can randomly pick one to serve.
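The service rule on this slide can be sketched from the point of view of one input. This is a simplified illustration, not the full DISQUO procedure: the function name, the set-of-pairs schedule representation, zero-based indices, and the uniform random choice are all assumptions.

```python
import random

def choose_output_to_serve(i, S, n, rng):
    """Pick the output whose crosspoint buffer input i serves in this slot.

    S is the current schedule as a set of active (input, output) pairs.
    """
    active = [j for j in range(n) if (i, j) in S]
    if active:
        # A feasible schedule has at most one active crosspoint per input,
        # and the input has to serve it.
        return active[0]
    # No active crosspoint for this input: pick one at random (assumed uniform).
    return rng.randrange(n)

rng = random.Random(0)
print(choose_output_to_serve(0, {(0, 2)}, 3, rng))
```

An output scheduler would apply the mirror-image rule over its column of crosspoint buffers.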

18 DISQUO Scheduling Algorithm Distributed Implementation Example [Figure panels: $n = (m+1)^{+}$ and $n = (m+1)^{-}$] Inputs and outputs can learn each other’s decisions by observing the crosspoint buffers, so they can keep the schedule consistent. Inputs 1 and 2 have to decide whether to keep (1, 2) and (2, 1) active, based on $p_{12}$ and $p_{21}$; in the example, both decide to become inactive. Input 3 has to decide whether to make (3, 2) active, with probability $p_{32}$; in the example, it decides to become active.

19 Simulations Uniform traffic

20 Simulations Non-uniform traffic – Throughput of RR-RR under hotspot traffic is 85%.

21 Simulations Impact of switch size – Delay is almost independent of switch size.

22 Simulations Impact of buffer size – K=1 is sufficient

23 HELIOS Scheduling Algorithm HELIOS is a distributed algorithm for a hybrid optical/electrical switch. Advantages: – Easy implementation (DWDM optical fiber) – Totally distributed without message passing – Uses an optical fabric to reduce power consumption – Guarantees 100% throughput for any admissible traffic

24 Architecture Each input is equipped with a fast tunable laser as the transmitter, which can tune to different wavelengths.

25 Architecture Each output has a fixed wavelength receiver operating in a specific WDM channel.

26 Architecture The optical fabric is a broadcast-and-select fabric.

27 The Linecard Model A λ-monitor is used to sense the channels, so that the inputs know which wavelengths are being used.

28 Implementation Example

29 Simulation Under Bernoulli i.i.d. traffic, the delay performance is poor compared to MWM. But if one slot time is only a few nanoseconds, the delay is still acceptable (i.e., < 10 μs).

30 Simulation Under on-off bursty traffic with a Pareto distribution (larger α means longer burst length), the delay performance is closer to that of MWM.

31 Summary We proposed a scheduling algorithm with very low computational complexity. The algorithm can easily be implemented in a distributed way for different switching architectures. It can guarantee 100% throughput for any admissible traffic, and for some architectures it can provide very good delay performance.

32 Thank you! Q&A

