
1 Achieving 100% throughput
Where we are in the course…
1. Switch model
2. Uniform traffic
   - Technique: Uniform schedule (easy)
3. Non-uniform traffic, but known traffic matrix
   - Technique: Non-uniform schedule (Birkhoff-von Neumann)
4. Unknown traffic matrix
   - Technique: Lyapunov functions (MWM)
5. Faster scheduling algorithms
   - Technique: Speedup (maximal matchings)
   - Technique: Memory and randomization (Tassiulas)
   - Technique: Twist architecture (buffered crossbar)
6. Accelerate scheduling algorithm
   - Technique: Pipelining
   - Technique: Envelopes
   - Technique: Slicing
7. No scheduling algorithm
   - Technique: Load-balanced router

2 Buffered Crossbars With Performance Guarantees
Taken from the 2004 Ph.D. defense of Shang-Tse (Da) Chuang, Department of Electrical Engineering, Stanford University, http://yuba.stanford.edu/~stchuang

3 Motivation
- Network operators want performance guarantees
  - Throughput guarantee
  - Delay guarantee
- High performance routers use crossbars
- Hard to build crossbar-based routers with guarantees
- My talk:
  - How a crossbar with a small amount of internal buffering can give guarantees

4 Contents
- Throughput Guarantees
  - Buffered Crossbar - 100% Throughput
  - Buffered Crossbar - Work Conservation

5 Generic Crossbar-Based Architecture
[Figure: crossbar-based switch with VOQs at the inputs, a scheduler, and a speedup of S]

6 Admissible Traffic
- Traffic Matrix
- Traffic is admissible if
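
The transcript does not carry the slide's formulas. As a hedged reminder, the condition usually used in the switch-scheduling literature is written below; the notation (N ports, rate \lambda_{ij} from input i to output j, normalized to the line rate) is an assumption, not taken from the slide, and some texts use \le instead of strict inequality.

```latex
\Lambda = [\lambda_{ij}], \qquad
\sum_{i=1}^{N} \lambda_{ij} < 1 \quad \forall j, \qquad
\sum_{j=1}^{N} \lambda_{ij} < 1 \quad \forall i .
```

In words: no input and no output is asked to carry more than its line rate.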

7 Throughput Guarantee
- 100% Throughput
  - An algorithm delivers 100% throughput if, for any admissible traffic, the average backlog is finite
[Figure: crossbar with a scheduler and a speedup of S]

8 Previous Work
[Figure: timeline from 1985 to 2005, with algorithms marked as "Heuristics" or "Theoretically Proven"]
- Wave Front Arbiter [Tamir]
- Parallel Iterative Matching [Anderson et al.]
- iSLIP [McKeown]
- Longest Port First [Mekkittikul et al.]
- Maximum Weight Matching [McKeown et al.]
- Maximal Matching S=2 [Dai, Prabhakar]

9 Maximal Matching Has Become Hard
- TTX Switch Fabric
  - Uses maximal matching
  - Speedup less than 2
  - Consumes up to 8 kW
  - Limited to ~2.5 Tb/s
  - No 100% throughput guarantee

10 Traditional Crossbar
- Crossbar Requirements
  - An input can send at most one cell
  - An output can receive at most one cell
- Scheduling Problem
  - Must overcome two constraints simultaneously
- New Crossbar
  - Relieve contention
  - Remove dependency between inputs and outputs
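
The two crossbar requirements can be sketched as a validity check, together with a greedy maximal matching that has to respect both at once. This is an illustrative sketch only; the function names and the `(input, output)` pair representation are assumptions, not part of the talk.

```python
def is_valid_crossbar_config(matches):
    """True if no input sends more than one cell and no output receives more than one.

    matches: list of (input, output) pairs transferred in one scheduling phase.
    """
    inputs = [i for i, _ in matches]
    outputs = [j for _, j in matches]
    return len(set(inputs)) == len(inputs) and len(set(outputs)) == len(outputs)


def greedy_maximal_matching(requests):
    """Greedy maximal matching over (input, output) requests, scanned in order.

    A pair is accepted only if BOTH its input and its output are still free,
    which is the coupling between inputs and outputs that the slide refers to.
    """
    used_inputs, used_outputs, matching = set(), set(), []
    for i, j in requests:
        if i not in used_inputs and j not in used_outputs:
            matching.append((i, j))
            used_inputs.add(i)
            used_outputs.add(j)
    return matching
```

For the requests `[(0, 0), (0, 1), (1, 0)]` the greedy scan keeps only `(0, 0)`: the result is maximal (nothing can be added) but smaller than the two-edge matching `[(0, 1), (1, 0)]`.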

11 Contents
- Throughput Guarantees
  - Buffered Crossbar - 100% Throughput
  - Buffered Crossbar - Work Conservation
- Delay Guarantees
  - Traditional Crossbar - Emulating an OQ Switch
  - Buffered Crossbar - Emulating an OQ Switch

12 Buffered Crossbar
- Arrival Phase
- Scheduling Phases - Speedup of 2
- Departure Phase

13 Scheduling Phase
- Input Schedule
  - Each input selects in parallel a cell for an empty crosspoint
- Output Schedule
  - Each output selects in parallel a cell from a full crosspoint

14 Example of Input/Output Scheduling
- Round-robin Policy
  - Each input schedules in a round-robin order
  - Each output schedules in a round-robin order
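
One way to picture a time slot with round-robin input and output schedules is the small simulation below. It is a sketch under stated assumptions: one-cell crosspoint buffers, the input schedule running before the output schedule within each scheduling phase, and the class and method names are all hypothetical rather than taken from the talk.

```python
from collections import deque


class BufferedCrossbar:
    """Toy N x N buffered crossbar with one-cell crosspoint buffers (assumed)."""

    def __init__(self, n):
        self.n = n
        self.voq = [[deque() for _ in range(n)] for _ in range(n)]  # voq[i][j]
        self.xp = [[None] * n for _ in range(n)]                    # crosspoint (i, j)
        self.out = [deque() for _ in range(n)]                      # output queues
        self.in_ptr = [0] * n    # round-robin pointer per input
        self.out_ptr = [0] * n   # round-robin pointer per output

    def input_schedule(self):
        # Each input independently moves one cell into an EMPTY crosspoint.
        for i in range(self.n):
            for k in range(self.n):
                j = (self.in_ptr[i] + k) % self.n
                if self.voq[i][j] and self.xp[i][j] is None:
                    self.xp[i][j] = self.voq[i][j].popleft()
                    self.in_ptr[i] = (j + 1) % self.n
                    break

    def output_schedule(self):
        # Each output independently drains one cell from a FULL crosspoint.
        for j in range(self.n):
            for k in range(self.n):
                i = (self.out_ptr[j] + k) % self.n
                if self.xp[i][j] is not None:
                    self.out[j].append(self.xp[i][j])
                    self.xp[i][j] = None
                    self.out_ptr[j] = (i + 1) % self.n
                    break

    def time_slot(self, arrivals, speedup=2):
        # Arrival phase, then `speedup` scheduling phases, then departure phase.
        for i, j, cell in arrivals:
            self.voq[i][j].append(cell)
        for _ in range(speedup):
            self.input_schedule()
            self.output_schedule()
        return [q.popleft() if q else None for q in self.out]
```

In a 2x2 switch where both inputs receive a cell for output 0 in the same slot, both cells reach the output queue within that slot because each input writes to its own crosspoint; the inputs never have to agree on a matching.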

15 Previous Work
- Buffered Crossbar Simulations [Rojas-Cessa et al. 2001]
  - 32x32 switch, Uniform Bernoulli Traffic, Round-Robin, S=1

16 100% Throughput
- Theorem 1
  - A buffered crossbar with speedup of 2 delivers 100% throughput for any admissible Bernoulli i.i.d. traffic using any work-conserving input/output schedules.

17 Intuition of Proof
[Figure: arrival and service rates for a backlogged flow, with labels "ε", "< 1-ε", "1-ε", and a sum "= 2-ε"]
- When a flow is backed up, the service for this backlog exceeds the arrivals

18 Contents
- Throughput Guarantees
  - Buffered Crossbar - 100% Throughput
  - Buffered Crossbar - Work Conservation
- Delay Guarantees
  - Traditional Crossbar - Emulating an OQ Switch
  - Buffered Crossbar - Emulating an OQ Switch

19 Work Conservation
- Work-conserving Property
  - If there is a cell for a given output in the system, that output is busy.
[Figure: Output Queued (OQ) Switch]

20 Emulating an OQ switch
- Under identical inputs, the departure time of every cell from both switches is identical

21 Input Priority List
[Figure: cells at the inputs, crosspoints, and outputs, each labeled with a departure time]
- Label each cell with its corresponding departure time
- Arrange input cells into an input priority list
- Output selects crosspoint with earliest departure time
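
Labeling cells with departure times presupposes a reference FIFO OQ switch. A minimal sketch of that labeling follows, assuming one departure per output per time slot and that a cell may depart in the slot in which it arrives; the function name and the `(arrival_slot, output)` representation are hypothetical.

```python
def fifo_oq_departure_times(arrivals, n_outputs):
    """Departure slot of each cell in a reference FIFO output-queued switch.

    arrivals: list of (arrival_slot, output), sorted by arrival_slot; ties are
    served in list order. Returns one departure slot per cell, in the same order.
    """
    next_free = [0] * n_outputs  # earliest slot in which each output is idle
    departures = []
    for slot, out in arrivals:
        depart = max(slot, next_free[out])  # wait behind earlier cells, if any
        departures.append(depart)
        next_free[out] = depart + 1         # the output sends one cell per slot
    return departures
```

Because each output sends a cell in every slot in which it has one queued, this reference model is work-conserving by construction.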

22 Input Priority List
[Figure: cells labeled with departure times; annotations "Good guy", "Bad guys", and "Bad guy"]
- Label each cell with its corresponding departure time
- Arrange input cells into an input priority list
- Output selects crosspoint with earliest departure time

23 Definitions
[Figure: cells labeled with departure times; annotations "2 good guys" and "2 bad guys"]
- Output Margin: cells at its output with earlier departure time
- Input Margin: cells ahead in input priority list destined to different outputs
- Total Margin: Output Margin minus Input Margin
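
The three definitions translate directly into counting functions. The sketch below assumes a cell is identified by its output and its departure time in the reference OQ switch, and that index 0 is the head of the input priority list; the `Cell` type and function names are hypothetical.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    output: int     # destination output port
    departure: int  # departure time in the reference OQ switch


def output_margin(cell, output_queue):
    """Cells already at the cell's output with an earlier departure time."""
    return sum(1 for c in output_queue if c.departure < cell.departure)


def input_margin(cell, priority_list):
    """Cells ahead of `cell` in its input priority list bound for other outputs."""
    position = priority_list.index(cell)
    return sum(1 for c in priority_list[:position] if c.output != cell.output)


def total_margin(cell, priority_list, output_queue):
    """Output Margin minus Input Margin."""
    return output_margin(cell, output_queue) - input_margin(cell, priority_list)
```

As a usage example, take the priority list `[Cell(1, 4), Cell(2, 1), Cell(1, 5)]` and an output-1 queue `[Cell(1, 2), Cell(1, 3)]`. For `Cell(1, 5)` the output margin is 2, the input margin is 1 (only `Cell(2, 1)` is ahead and bound elsewhere), so the total margin is 1.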

24 Emulation of FIFO OQ Switch
[Figure: cells labeled with departure times]
- Scheduling Phase
  - Crosspoint is full: Output Margin will increase by one
  - Crosspoint is empty: Input Margin will decrease by one
- Total Margin increases by two

25 Emulation of FIFO OQ Switch
[Figure: cells labeled with departure times]
- Arrival Phase
  - Input Margin might increase by one
- Departure Phase
  - Output Margin will decrease by one
- Total Margin decreases by at most two

26 Emulation of FIFO OQ Switch
[Figure: cells labeled with departure times]
- Lemma 1
  - For every time slot, total margin does not decrease
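
Lemma 1 follows by adding up the per-phase changes stated on slides 24 and 25. The bookkeeping below is a sketch; the \Delta notation is introduced here for illustration, and reading "increases by two" as the combined effect of the two scheduling phases (speedup of 2) is an interpretation of the slides.

```latex
\begin{align*}
TM &= OM - IM \\
\Delta TM_{\text{scheduling}} &= +2
  && \text{(two scheduling phases, slide 24)} \\
\Delta TM_{\text{arrival}} + \Delta TM_{\text{departure}} &\ge -2
  && \text{(input margin $+1$ at most, output margin $-1$, slide 25)} \\
\Delta TM_{\text{slot}} &= \Delta TM_{\text{scheduling}}
  + \Delta TM_{\text{arrival}} + \Delta TM_{\text{departure}} \ge 0
\end{align*}
```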

27 FIFO Insertion Policy
[Figure: cells labeled with departure times, with a new cell arriving at an input]
- Arrival Phase
  - Cell for non-empty VOQ, insert behind cells for same output
  - Cell for empty VOQ, insert at head of input priority list
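
The insertion rule can be sketched as a single list operation. The sketch assumes the priority list is a Python list of `(output, label)` pairs with index 0 as the head, and reads "behind cells for same output" as "immediately behind the last such cell"; both the representation and that reading are assumptions.

```python
def fifo_insert(priority_list, cell):
    """Insert an arriving cell into an input priority list (index 0 = head).

    priority_list: list of (output, label) pairs, modified in place.
    cell: the arriving (output, label) pair.
    """
    output = cell[0]
    last_same_output = -1
    for index, (out, _) in enumerate(priority_list):
        if out == output:
            last_same_output = index
    if last_same_output == -1:
        # Empty VOQ: the new cell goes to the head of the list.
        priority_list.insert(0, cell)
    else:
        # Non-empty VOQ: the new cell goes right behind the last cell
        # destined to the same output.
        priority_list.insert(last_same_output + 1, cell)
    return priority_list
```

A cell inserted at the head has no cells ahead of it, so its input margin is zero and its total margin equals its output margin, which cannot be negative.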

28 FIFO Insertion Policy
[Figure: cells labeled with departure times]
- Lemma 2
  - An arriving cell will have a non-negative total margin

29 Emulation of FIFO OQ Switch
- Theorem 2
  - A buffered crossbar with speedup of 2 can exactly emulate a FIFO OQ switch.
- Result was shown independently
  - B. Magill, C. Rohrs, R. Stevenson, “Output-Queued Switch Emulation by Fabrics With Limited Memory,” IEEE Journal on Selected Areas in Communications, pp. 606-615, May 2003.
- Theorem 3
  - A buffered crossbar with speedup of 2 can be work-conserving with a distributed algorithm.

30 Summary
- Buffered crossbars
  - Use crosspoints to relieve contention
  - Inputs and outputs schedule independently and in parallel
- Performance guarantees
  - Throughput: any work-conserving input/output schedule
  - Work Conservation: simple insertion policy

31 Relevant Papers
- Crossbars
  - Shang-Tse Chuang, Ashish Goel, Nick McKeown, Balaji Prabhakar, “Matching Output Queuing with a Combined Input Output Queued Switch,” IEEE Journal on Selected Areas in Communications, vol. 17, no. 6, pp. 1030-1039, Dec. 1999.
- Buffered Crossbars
  - Shang-Tse Chuang, Sundar Iyer, Nick McKeown, “Practical Algorithms for Performance Guarantees in Buffered Crossbars,” in preparation for IEEE/ACM Transactions on Networking.

32 Thank you!

