Pipelined Two Step Iterative Matching Algorithms for CIOQ Crossbar Switches Deng Pan and Yuanyuan Yang State University of New York, Stony Brook.

Slides:



Advertisements
Similar presentations
EE384y: Packet Switch Architectures
Advertisements

1 Outline  Why Maximal and not Maximum  Definition and properties of Maximal Match  Parallel Iterative Matching (PIM)  iSLIP  Wavefront Arbiter (WFA)
Router Architecture : Building high-performance routers Ian Pratt
Submitters: Erez Rokah Erez Goldshide Supervisor: Yossi Kanizo.
Nick McKeown CS244 Lecture 6 Packet Switches. What you said The very premise of the paper was a bit of an eye- opener for me, for previously I had never.
Towards Simple, High-performance Input-Queued Switch Schedulers Devavrat Shah Stanford University Berkeley, Dec 5 Joint work with Paolo Giaccone and Balaji.
A Scalable Switch for Service Guarantees Bill Lin (University of California, San Diego) Isaac Keslassy (Technion, Israel)
Algorithm Orals Algorithm Qualifying Examination Orals Achieving 100% Throughput in IQ/CIOQ Switches using Maximum Size and Maximal Matching Algorithms.
Making Parallel Packet Switches Practical Sundar Iyer, Nick McKeown Departments of Electrical Engineering & Computer Science,
Fast Matching Algorithms for Repetitive Optimization Sanjay Shakkottai, UT Austin Joint work with Supratim Deb (Bell Labs) and Devavrat Shah (MIT)
1 Input Queued Switches: Cell Switching vs. Packet Switching Abtin Keshavarzian Joint work with Yashar Ganjali, Devavrat Shah Stanford University.
April 10, HOL Blocking analysis based on: Broadband Integrated Networks by Mischa Schwartz.
1 Comnet 2006 Communication Networks Recitation 5 Input Queuing Scheduling & Combined Switches.
The Concurrent Matching Switch Architecture Bill Lin (University of California, San Diego) Isaac Keslassy (Technion, Israel)
Chapter 10 Switching Fabrics. Outline Physical Interconnection Physical box with backplane Individual blades plug into backplane slots Each blade contains.
1 ENTS689L: Packet Processing and Switching Buffer-less Switch Fabric Architectures Buffer-less Switch Fabric Architectures Vahid Tabatabaee Fall 2006.
048866: Packet Switch Architectures Dr. Isaac Keslassy Electrical Engineering, Technion MSM.
CSIT560 by M. Hamdi 1 Course Exam: Review April 18/19 (in-Class)
048866: Packet Switch Architectures Dr. Isaac Keslassy Electrical Engineering, Technion The.
048866: Packet Switch Architectures Dr. Isaac Keslassy Electrical Engineering, Technion Scaling.
6/22/20151 CLOS-NETWORK SWITCHES. H. Jonathan Chao 6/22/2015 Page 2 A Growable Switch Configuration i j.
EE 122: Router Design Kevin Lai September 25, 2002.
Lecture 11. Matching A set of edges which do not share a vertex is a matching. Application: Wireless Networks may consist of nodes with single radios,
CS 268: Lecture 12 (Router Design) Ion Stoica March 18, 2002.
CIST560 by M. Hamdi 1 Packet Scheduling/Arbitration in Virtual Output Queues: Maximal Matching Algorithms (Part II)
Maximum Size Matchings & Input Queued Switches Sundar Iyer, Nick McKeown High Performance Networking Group, Stanford University,
COMP680E by M. Hamdi 1 Course Exam: Review April 17 (in-Class)
1 Achieving 100% throughput Where we are in the course… 1. Switch model 2. Uniform traffic  Technique: Uniform schedule (easy) 3. Non-uniform traffic,
1 Netcomm 2005 Communication Networks Recitation 5.
048866: Packet Switch Architectures Dr. Isaac Keslassy Electrical Engineering, Technion Maximal.
048866: Packet Switch Architectures Dr. Isaac Keslassy Electrical Engineering, Technion Scheduling.
Distributed Scheduling Algorithms for Switching Systems Shunyuan Ye, Yanming Shen, Shivendra Panwar
Localized Asynchronous Packet Scheduling for Buffered Crossbar Switches Deng Pan and Yuanyuan Yang State University of New York Stony Brook.
Load Balanced Birkhoff-von Neumann Switches
Belgrade University Aleksandra Smiljanić: High-Capacity Switching Switches with Input Buffers (Cisco)
An Integrated IP Packet Shaper and Scheduler for Edge Routers MSEE Project Presentation Student: Yuqing Deng Advisor: Dr. Belle Wei Spring 2002.
Dynamic Networks CS 213, LECTURE 15 L.N. Bhuyan CS258 S99.
CS 552 Computer Networks IP forwarding Fall 2005 Rich Martin (Slides from D. Culler and N. McKeown)
ATM SWITCHING. SWITCHING A Switch is a network element that transfer packet from Input port to output port. A Switch is a network element that transfer.
1 Copyright © Monash University ATM Switch Design Philip Branch Centre for Telecommunications and Information Engineering (CTIE) Monash University
High Speed Stable Packet Switches Shivendra S. Panwar Joint work with: Yihan Li, Yanming Shen and H. Jonathan Chao New York State Center for Advanced Technology.
QoS Support in High-Speed, Wormhole Routing Networks Mario Gerla, B. Kannan, Bruce Kwan, Prasasth Palanti,Simon Walton.
Enabling Class of Service for CIOQ Switches with Maximal Weighted Algorithms Thursday, October 08, 2015 Feng Wang Siu Hong Yuen.
Summary of switching theory Balaji Prabhakar Stanford University.
Sami Al-wakeel 1 Data Transmission and Computer Networks The Switching Networks.
The Router SC 504 Project Gardar Hauksson Allen Liu.
Routers. These high-end, carrier-grade 7600 models process up to 30 million packets per second (pps).
ISLIP Switch Scheduler Ali Mohammad Zareh Bidoki April 2002.
Packet Forwarding. A router has several input/output lines. From an input line, it receives a packet. It will check the header of the packet to determine.
Crossbar Switch Project
Stress Resistant Scheduling Algorithms for CIOQ Switches Prashanth Pappu Applied Research Laboratory Washington University in St Louis “Stress Resistant.
Belgrade University Aleksandra Smiljanić: High-Capacity Switching Switches with Input Buffers (Cisco)
Buffered Crossbars With Performance Guarantees Shang-Tse (Da) Chuang Cisco Systems EE384Y Thursday, April 27, 2006.
SNRC Meeting June 7 th, Crossbar Switch Scheduling Nick McKeown Professor of Electrical Engineering and Computer Science, Stanford University
Improving Matching algorithms for IQ switches Abhishek Das John J Kim.
Topics in Internet Research: Project Scope Mehreen Alam
Reduced Rate Switching in Optical Routers using Prediction Ritesh K. Madan, Yang Jiao EE384Y Course Project.
Input buffered switches (1)
Providing QoS in IP Networks
1 Chapter 7 Network Flow Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved.
Network layer (addendum) Slides adapted from material by Nick McKeown and Kevin Lai.
scheduling for local-area networks”
Packet Forwarding.
Packet Switching (basics)
Chapter 7 Network Flow Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved.
Packet Scheduling/Arbitration in Virtual Output Queues and Others
Outline Why Maximal and not Maximum
Stability Analysis of MNCM Class of Algorithms and two more problems !
EE 122: Lecture 7 Ion Stoica September 18, 2001.
Long Gong, Paul Tune, Liang Liu, Sen, Jun (Jim) Xu
Presentation transcript:

Pipelined Two Step Iterative Matching Algorithms for CIOQ Crossbar Switches Deng Pan and Yuanyuan Yang State University of New York, Stony Brook

Outline Introduction Two step iterative matching for VOQ switches Pipelined two step iterative matching for CIOQ switches Simulation results Conclusions

Introduction Crossbar switches operating with fixed length packets have demonstrated advantages in high speed switching: Crossbar provides non-blocking switching capability Fixed length packets enable the switch to work in a synchronous time slot mode

Introduction In such a switch, input ports and output ports are connected by a crossbar switching fabric, which may have speedup capability. A crossbar with speedup of S can remove S packets from each input port and deliver S packets to each output port in a single time slot.

Introduction Depending on the actual speedup, blocked packets may be buffered at either the input side or the output side or both. S = 1, IQ switch S = N, OQ switch 1 < S < N, CIOQ switch

Introduction For the buffer space at the input side, traditional single FIFO queue suffers from head of line (HOL) blocking. Virtual output queued (VOQ) buffering eliminates HOL blocking by maintaining a separate queue for each output port at each input port. An IQ switch with VOQ buffering is called a VOQ switch.

Introduction The scheduling on crossbar switches can be viewed as a special case of the bipartite graph matching problem. Input ports and output ports make up of the two disjoint sets of vertices. The scheduling decisions are represented by the edges from input ports to output ports.

Introduction Maximum size matching (MSM) and maximum weight matching (MWM) maximize the throughput of the switch, but have high time complexity. To make fast scheduling decisions, iterative matching algorithms, such as parallel iterative matching (PIM) and iSLIP, were proposed.

Introduction Iterative matching algorithms obtain a maximal matching in multiple iterations, by incrementally adding input-output pairs. One iteration usually consists of three steps: Request step Grant step Accept step

Introduction Our objective is to design efficient iterative matching algorithms for CIOQ crossbar switches. Two step iterative matching for VOQ switches Pipelined two step iterative matching for CIOQ switches

Outline Introduction Two step iterative matching for VOQ switches Pipelined two step iterative matching for CIOQ switches Simulation results Conclusions

Two step iterative matching for VOQ switches For three step iterative matching algorithms, each input port can send up to N requests and receive up to N grants. Thus, the accept step is necessary for each input port to choose one grant to accept.

Two step iterative matching for VOQ switches By incorporating arbitration into the request step, each input port sends only one request and receives at most one grant. The accept step can be eliminated. Advantages of two step iterative matching: Shorter scheduling time Simpler implementation Less data exchange

Two step iterative matching for VOQ switches Two step parallel iterative matching (PIM2): Request step. Each input port randomly sends a request to an output port for which it has a buffered packet. Grant step. An output port randomly grants to one request among all requests it receives. The output port marks itself and the corresponding input port as matched.

Two step iterative matching for VOQ switches As in PIM, the request or grant arbitrations of different input ports or output ports in PIM2 are independent, and can be done in parallel to accelerate the matching process. Also, PIM2 makes arbitration decisions on a random basis, so each input port has equal transmission opportunity, and fairness is achieved.

Two step iterative matching for VOQ switches Since an input port sends much less requests in one iteration, will PIM2 take more iterations than PIM to converge? Theoretical analysis and simulation results both show that, three step algorithms and two step algorithms have almost identical convergence properties.

Two step iterative matching for VOQ switches Theorem 1. Assume M >= N. Then for an N x M or M x N switch, the average number of convergence iterations of PIM2 is less than or equal to lnN+e/(e−1). The classical analysis for the average convergence iterations of PIM is log 2 N+4/3.

Two step iterative matching for VOQ switches The basic idea of two step iterative matching can be generalized to other existing three step algorithms as well. Two step version of iSLIP (iSLIP2): Request step. Each free input port sends a request to the first free output port that appears next to its round robin pointer. Grant step. Each free output port chooses the request from the first input port that appears next to its round robin pointer, and grants it to transmit. The round robin pointers are only updated in the first iteration.

Two step iterative matching for VOQ switches Similarly to iSLIP, under heavy load, the round robin pointers in iSLIP2 tend to desynchronize, so that fast scheduling can be made. The two step iterative matching can be used to schedule multicast traffic. Two step iterative matching algorithms can also be adapted to schedule packets for CIOQ switches.

Outline Introduction Two step iterative matching for VOQ switches Pipelined two step iterative matching for CIOQ switches Simulation results Conclusions

Pipelined two step iterative matching for CIOQ switches On VOQ switches, the request step and grant step of a two step iterative matching algorithm can only be executed in a sequential manner. The request logic and grant logic are only busy for a half of the total time and are not fully utilized.

Pipelined two step iterative matching for CIOQ switches For CIOQ switches with speedup of two, the scheduling algorithm needs to generate two matchings in each time slot. If the request step of one matching is coincident with the grant step of the other, and vice versa, the arbitration logics can be fully pipelined

Pipelined two step iterative matching for CIOQ switches Since the two matchings progress simultaneously, two virtual switches are needed so that each matching can independently work on one of them. This is done by adding look ahead information into each virtual queue.

Pipelined two step iterative matching for CIOQ switches In addition to the head of line (HOL) packet of each virtual queue, the second of line (SOL) packet is checked as well. The HOL matching and SOL matching are completely independent.

Pipelined two step iterative matching for CIOQ switches The proposed scheme can efficiently pipeline any two step iterative matching algorithm. Theorem 2. Any pipelined two step iterative matching algorithm on a CIOQ switch with speedup of two achieves 100% throughput for any admissible traffic.

Pipelined two step iterative matching for CIOQ switches For HOL matching, only one HOL packet can be matched among all the virtual queues of the same input or those to the same output. By rescheduling the failed HOL packets in the SOL matching, the packet delay can be further reduced.

Pipelined two step iterative matching for CIOQ switches The proposed pipeline scheme can be efficiently implemented in hardware.

Outline Introduction Two step iterative matching for VOQ switches Pipelined two step iterative matching for CIOQ switches Simulation results Conclusions

Simulation results Simulations are conducted to: verify the accuracy of the convergence iteration analysis test the performance of the two step iterative matching algorithms In the simulations, we consider: Bernoulli arrival and burst arrival uniform traffic and non-uniform traffic (hotspot traffic)

Analytical convergence result PIM and PIM2 have almost the same average convergence iterations. lnN+e/(e−1) is closer to the simulation results than log 2 N+4/3.

Input queuing delay PIM, iSLIP, PIM2, and iSLIP2 have almost the same input queuing delay. As can be expected, with pipelining and HOL rescheduling, the delay is reduced.

Transmission delay The pipelined algorithms are guaranteed to achieve 100% throughput. The pipelined algorithms with HOL rescheduling has similar delay as OQFIFO.

Convergence property All the algorithms use a similar number of iterations to converge. Under 100% Bernoulli uniform traffic, iSLIP and iSLIP2 have convergence iterations of 1.

Outline Introduction Two step iterative matching for VOQ switches Pipelined two step iterative matching for CIOQ switches Simulation results Conclusions

We analyzed the advantages of the two step iterative matching algorithms and presented PIM2 as an example for VOQ switches. We theoretically proved that the average convergence iterations of PIM2 is less than lnN + e/(e − 1), and showed by simulation that it is a more accurate estimation than the classical result log 2 N + 4/3.

Conclusions We proposed the Second of Line (SOL) matching scheme to efficiently pipeline two step iterative matching algorithms for CIOQ switches, and proved that it is guaranteed to achieve 100% throughput. In order to further reduce the packet delay, the HOL rescheduling mechanism was proposed to improve the matching chances of the SOL matching.

Thank you Questions?