1 Scheduling Reserved Traffic in Input-Queued Switches: New Delay Bounds via Probabilistic Techniques Milan Vojnović EPFL Joint work with Matthew Andrews.

Presentation transcript:

1 Scheduling Reserved Traffic in Input-Queued Switches: New Delay Bounds via Probabilistic Techniques Milan Vojnović EPFL Joint work with Matthew Andrews Bell Laboratories, Murray Hill, NJ LCA Seminars Talk, EPFL, March 27, 2003

2 Introduction: Input-Queued Switch. [Figure: an I × I crossbar connecting the input ports to the output ports.] At any point in time, connectivity is restricted to permutation matrices.

3 Some Existing Approaches for Crossbar Scheduling: maximum-weight matching (McKeown '96, many others); decomposition-based scheduling (Chang et al., 2000); fluid tracking (Tabatabaee et al., ToN '01).

4 Decomposition-Based Scheduling. Given: $M = [m_{ij}]$, an I × I rate-demand matrix, where $m_{ij}$ is the intensity of the service to be offered to the ij-th input/output port pair. Assume M is doubly sub-stochastic. Constraint: the crossbar. Find: decompose M into permutation matrices, then find a schedule such that the intensity of the service offered to the ij-th input/output port pair is at least $m_{ij}$.

5 Decomposition-Based Scheduling (cont'd). Observation: a solution to the problem ensures that the service rate is at least M in the long run. Desired property: broadly speaking, we also want the schedule to be "smooth" (non-bursty), that is, transmission slots should be offered evenly to every input-output port pair. Observation: smoothness is a short-run property.

6 A Decomposition: Birkhoff/von Neumann. Birkhoff/von Neumann (e.g. Chvátal '84, p. 330): any doubly stochastic matrix M is a convex combination of permutation matrices, that is, $M = \sum_k \phi_k M_k$, where $M_k$ is a permutation matrix and $\phi_k \ge 0$, with $\sum_k \phi_k = 1$, is the intensity of the k-th permutation matrix. Other decompositions can be used for doubly sub-stochastic M; Birkhoff/von Neumann maximizes throughput. The Birkhoff/von Neumann decomposition was applied to the switch problem by Chang et al. (2000).
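
A minimal sketch of how such a decomposition can be computed, not the construction used by Chang et al. It assumes the input is exactly doubly stochastic and given as rationals, repeatedly finds a permutation inside the support of the remaining matrix (with a simple augmenting-path matching), and peels it off with the largest feasible weight. The function names `perfect_matching` and `birkhoff_decompose` are hypothetical.

```python
from fractions import Fraction


def perfect_matching(M):
    """Return perm with M[i][perm[i]] > 0 for every row i (augmenting paths)."""
    n = len(M)
    match_col = [-1] * n  # match_col[j] = row currently matched to column j

    def try_row(i, seen):
        for j in range(n):
            if M[i][j] > 0 and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            raise ValueError("no permutation in the support; "
                             "is the matrix exactly doubly stochastic?")
    perm = [0] * n
    for j, i in enumerate(match_col):
        perm[i] = j
    return perm


def birkhoff_decompose(M):
    """Return a list of (weight, perm) pairs whose weighted sum equals M."""
    M = [[Fraction(x) for x in row] for row in M]  # exact copy of the input
    n = len(M)
    terms = []
    while any(x > 0 for row in M for x in row):
        perm = perfect_matching(M)
        weight = min(M[i][perm[i]] for i in range(n))
        for i in range(n):
            M[i][perm[i]] -= weight  # zeroes at least one entry per round
        terms.append((weight, perm))
    return terms
```

Each round zeroes at least one entry, so the loop ends after at most I² rounds. Exact rationals are used because floating-point row sums that are only approximately 1 can leave a residual matrix with no permutation in its support.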

7 The Problem That We Study. Given: $M_1, M_2, \ldots, M_K$, a sequence of permutation matrices. Find: schedules with a guarantee on their smoothness, where "smooth" is quantified through the concept of latency, defined shortly.

8 Why the Problem Is Important. Beyond rate provision, it concerns delay-jitter guarantees for DiffServ classes such as EF (Expedited Forwarding), guarantees for MPLS, and the provision of a good connection-reservation table that offers guaranteed service to control traffic inside a switch.

9 Related Work. When the load is not more than 1/4 (Giles and Hajek '97): a schedule exists such that each pair ij is scheduled at least once in every $1/m_{ij}$ slots. When the load is 1 (Chang et al. '00): with the Birkhoff/von Neumann decomposition plus PGPS scheduling of the resulting permutation matrices, a latency bound exists (shown shortly).

10 Related Work (cont'd). Mean-delay results: Leonardi et al. (Infocom '01) characterize the mean delay of a maximum-weight-matching switch that is uniformly loaded with load less than 1. Shah and Kopikare (Infocom '02) characterize the mean delay of a switch with Bernoulli arrivals of load less than 1 whose scheduler picks, at each slot, a permutation matrix uniformly at random from the entire set of I! permutation matrices.

11 Content: method to construct schedules; latency definition used; latencies of four schedulers (Random-Permutation, Random-Phase, Random-Distortion, Poisson Competition); numerical examples; a taste of the methods used to obtain the results; conclusion.

12 Method to Construct a Schedule: Superposition of Marked Point Processes. [Figure: point processes $N_1, N_2, \ldots, N_K$ and their superposition N, which gives the schedule.]

13 Latency of a Schedule. Latency 1: for any n, m, there exists … Latency 2: for any n, there exists … Latency 3: there exists …

14 Latency of a Schedule. [Figure: number of slots offered to the ij-th port pair in [0, m), plotted against m.]
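
The exact latency formulas are not recoverable from this transcript, so the sketch below assumes one common reading: a port pair served at long-run rate r has latency T if every window of m consecutive slots offers it at least r(m − T) slots. For a periodic schedule it is enough to scan windows of length up to one period. The function name `strict_latency` and the 0/1 encoding of the schedule are assumptions made for illustration.

```python
from fractions import Fraction


def strict_latency(served):
    """Smallest T with N(n, n+m) >= r * (m - T) for all windows [n, n+m).

    served: one period of the schedule as seen by a single port pair,
            a list with 1 where the pair is offered the slot and 0 elsewhere.
    The rate r is the fraction of offered slots in the period.
    """
    period = len(served)
    total = sum(served)
    if total == 0:
        raise ValueError("the port pair is never served")
    rate = Fraction(total, period)
    worst = Fraction(0)
    for start in range(period):
        count = 0
        for length in range(1, period + 1):
            count += served[(start + length - 1) % period]
            # Lateness of this window relative to ideal service at the rate.
            worst = max(worst, length - count / rate)
    return worst
```

Under this assumed definition, two schedules with the same rate can have very different latencies: spreading the offered slots evenly keeps T small, while clustering them leaves a long unserved gap.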

15 It Is Valuable to Have an Input-Output Port Pair Characterized by a Rate-Latency. The latency is a bound on the lateness of the slots offered to the ij-th port pair. It gives a strict (rate-latency) service curve. Having an input-output port pair characterized by a service curve enables us to use known results from network calculus to bound backlog and delay for appropriately characterized arrival traffic.
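
As an illustration of the kind of network-calculus result meant here, consider the standard case of a token-bucket arrival curve and a rate-latency service curve. The symbols σ (burst), ρ (arrival rate), r (service rate), and T (latency) are assumptions introduced for this example and do not come from the slides.

```latex
\alpha(t) = \sigma + \rho t, \qquad
\beta(t) = r\,(t - T)^{+}, \qquad
\rho \le r
\;\Longrightarrow\;
\text{backlog} \le \sigma + \rho T, \qquad
\text{delay} \le T + \frac{\sigma}{r}.
```

Both bounds grow linearly in T, which is why a smaller latency for a port pair translates directly into tighter backlog and delay guarantees.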

16 Scheduler by Chang et al. [Figure: a PGPS server fed by token arrivals; served tokens are placed back as new arrivals.] Initialization: a token of type k arrives at …

17 Scheduler by Chang et al. (cont'd). [Figure: timelines of tokens 1, 2, …, K and the resulting schedule.]

18 Scheduler by Chang et al. (cont'd). The bound of Chang et al. is almost tight. One can construct an example that almost attains the bound; see the paper.

19 Smooth per Permutation Matrix May Not Mean Smooth per Input-Output Port Pair. An input-output port pair may be scheduled by more than one permutation matrix. The aggregate of a subset of permutation matrices may not be smoothly scheduled, even though the schedule of each permutation matrix is smooth. If each input-output port pair had a 1 in exactly one permutation matrix, the problem would reduce to classical polling.

20 Random Permutation Scheduler. [Figure: timelines of tokens 1, 2, …, K and the resulting schedule; the pattern on [0, 1) is copied to the later intervals.]
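
One plausible reading of this scheduler, sketched under stated assumptions: each permutation matrix k owns a whole number of slots in a frame of L slots, the frame is shuffled uniformly at random once, and the shuffled frame is then repeated periodically. The slide itself does not spell out the construction, and the names `random_permutation_frame` and `matrix_at` are hypothetical.

```python
import random


def random_permutation_frame(slots_per_matrix, rng=None):
    """Build one frame: matrix k appears slots_per_matrix[k] times, shuffled.

    slots_per_matrix: assumed integer slot counts per frame, so the
    intensity of matrix k is slots_per_matrix[k] / L with L their sum.
    """
    rng = rng or random.Random()
    frame = [k for k, n in enumerate(slots_per_matrix) for _ in range(n)]
    rng.shuffle(frame)  # uniformly random order of the L tokens
    return frame


def matrix_at(frame, slot):
    """Index of the permutation matrix used in a slot (periodic copy of the frame)."""
    return frame[slot % len(frame)]
```

Shuffling keeps the per-frame slot counts exact, so the long-run rates are preserved and only the short-run placement of slots is random.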

21 Latency of Random Permutation Scheduler. Result 1: fix some $0 < \varepsilon < 1$. With probability $1 - \varepsilon$, … where … (for …, the same estimate holds with A = 1/2 ln …).

22 Flavor of a Way to Obtain the Result. [Derivation annotations: the range of a Brownian bridge; definition of latency 3; period L; known result.]

23 Variance of the offered slots with Random Permutation

24 Random-Phase Scheduler. [Figure: timelines of tokens 1, 2, …, K and the resulting schedule.]

25 Random-Phase Scheduler (cont'd). Result 2: assume that the intensity of each permutation matrix is an integer multiple of 1/L. With probability 1, …

26 Random-Distortion Scheduler. [Figure: timelines of tokens 1, 2, …, K and the resulting schedule.]

27 Random-Distortion Scheduler (cont'd). Result 3: assume that the intensity of each permutation matrix is an integer multiple of 1/L. With probability 1, …

28 Poisson-Competition Scheduler. Amounts to: at each slot, the permutation matrix is of type k with probability … For latency 2: waiting time of a Geo/D/1 queue (known); Brownian approximation.
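
A minimal sketch of the slot-level behavior described above, assuming that the type of each slot is drawn independently and that the probability of type k equals the intensity of the k-th permutation matrix (the probability itself is missing from the slide). The function name `poisson_competition_schedule` is hypothetical.

```python
import random


def poisson_competition_schedule(intensities, num_slots, rng=None):
    """Draw the permutation-matrix index for each slot independently.

    intensities: assumed per-matrix probabilities (non-negative, not all
    zero); random.choices normalizes them if they do not sum to 1.
    """
    rng = rng or random.Random()
    return rng.choices(range(len(intensities)), weights=intensities, k=num_slots)
```

Under this reading, the gaps between consecutive slots of type k are geometrically distributed, which is consistent with the Geo/D/1 queue mentioned for latency 2.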

29 Numerical Evaluations. Goal: evaluate latencies over a large set of service rate matrices (matrix M defined earlier). Algorithm to generate stochastic matrices:
Begin (k = 0): set the I × I matrix M such that $m_{ij} = 1/L$ for all i, j.
Step k, for k = 1, …, $k_0$: draw $i_1, j_1, i_2, j_2$ uniformly at random from {1, 2, …, I}; draw d uniformly at random on $[0, \min(m_{i_1 j_1}, m_{i_2 j_2})]$; set $m_{i_1 j_1} \leftarrow m_{i_1 j_1} - d$, $m_{i_2 j_2} \leftarrow m_{i_2 j_2} - d$, $m_{i_1 j_2} \leftarrow m_{i_1 j_2} + d$, $m_{i_2 j_1} \leftarrow m_{i_2 j_1} + d$.
The evolution of M is a Markov chain. One may perhaps prefer to generate M uniformly at random over the space of doubly stochastic matrices.
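
The generation steps can be sketched as follows. One assumption differs from the transcript: the initial entries are set to 1/I rather than 1/L, so that every row and column sums to 1 from the start. Each step moves mass d around a 2 × 2 sub-pattern, which leaves all row and column sums unchanged. The function name `random_doubly_stochastic` is hypothetical.

```python
import random


def random_doubly_stochastic(size, steps, rng=None):
    """Random walk on doubly stochastic matrices via mass-preserving swaps."""
    rng = rng or random.Random()
    # Assumption: uniform start with entries 1/size (the transcript says 1/L).
    M = [[1.0 / size] * size for _ in range(size)]
    for _ in range(steps):
        i1, j1, i2, j2 = (rng.randrange(size) for _ in range(4))
        if i1 == i2 or j1 == j2:
            # Degenerate draw: the four updates cancel, so M is unchanged.
            continue
        d = rng.uniform(0.0, min(M[i1][j1], M[i2][j2]))
        M[i1][j1] -= d
        M[i2][j2] -= d
        M[i1][j2] += d
        M[i2][j1] += d
    return M
```

Because d never exceeds either entry it is subtracted from, all entries stay non-negative, and the result remains doubly stochastic up to floating-point rounding.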

30 Numerical Evaluations: Varying Switch Size. Observation: except for small switch sizes, the random-phase bound is tighter than the PGPS bound; the random-distortion bound is the tightest.

31 Numerical Evaluations: Per-Port-Pair Latencies for a 64 × 64 Matrix (L = 4096, K = 2423). Observation: for large enough x, the fraction is larger for random-phase than for PGPS; the fraction is largest for random-distortion.

32 Numerical Evaluation for Random Permutation Scheduler. [Figure: horizontal axis L.]

33 Excerpts from the Analysis

34 Preliminaries. "Good" event: … Assume: … Result 1: …

35 Preliminaries (cont'd). Result 2: … Putting the pieces together: the event $G_{n,m}$ is implied by events that are easier to handle.

36 Random-Phase Scheduler. Scheduler definition: … Assume … Then … It remains only to handle two events.

37 Random-Phase Scheduler (cont'd). Note … (Hoeffding). Similarly … Finally, … (sum to L, periodicity) … > 0.
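
For reference, the standard form of Hoeffding's inequality is stated below. How it is instantiated for the random phases is not recoverable from the slide, and the symbols $X_i$, $a_i$, $b_i$, $S_n$, and t are introduced here only for the statement.

```latex
X_1, \ldots, X_n \text{ independent},\quad X_i \in [a_i, b_i],\quad S_n = \sum_{i=1}^{n} X_i,\quad t > 0
\;\Longrightarrow\;
\Pr\bigl[\, S_n - \mathbb{E}[S_n] \ge t \,\bigr]
\le \exp\!\left( -\frac{2 t^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \right).
```

The same bound holds for the lower tail, which matches the "Similarly" step on the slide.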

38 Random-Phase Scheduler: Derandomization. Method of conditional probabilities. Assume …

39 Random-Phase Scheduler: Derandomization (cont'd). Result: there exist … In addition, if … then …

40 Random-Phase Scheduler: Derandomization (cont'd). Application to our problem: we showed … < 1. By the method of conditional probabilities, it follows that the latency holds with probability 1.

41 Conclusion. We showed that one can obtain less pessimistic bounds on latency that hold in probability. One can derandomize and obtain latencies that hold with probability 1. In many cases the obtained latencies are better than the best-known latency. The point-process approach may be used to construct other schedulers. It is worth trying to obtain sharper results. The question remains: what is the best possible latency for load larger than 1/4?