A Deficit Round Robin 20MB/s Layer 2 Switch
Muraleedhara Navada, Francois Labonte

Presentation transcript:

1 A Deficit Round Robin 20MB/s Layer 2 Switch
Muraleedhara Navada, Francois Labonte

2 Fairness in Switches
How can a switch provide fair bandwidth allocation at the output link?
–A simple FIFO favors greedy flows
Separate flows into FIFOs at the output
–Bit-by-bit fair queuing
–Weighted Fair Queuing allows different weights for flows
–Packetized Weighted Fair Queuing (aka PGPS) calculates a departure time for each packet
[Figure: output-queued switch with round-robin, bit-by-bit allocation]

3 Deficit Round Robin
Packetized Weighted Fair Queuing is complicated to implement
Deficit Round Robin keeps track of credits for each flow
–A flow sends according to its credits
–Credits are added according to the flow's weight
–Essentially PWFQ at a coarser granularity
[Figure: per-flow credits over time]
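The credit bookkeeping on this slide can be sketched in a few lines of Python. This is a minimal illustration of textbook Deficit Round Robin, not the authors' hardware design: the function name `drr_schedule`, the per-flow `quanta` (standing in for the slide's weights), and the example packet sizes are all assumptions made for the example.

```python
from collections import deque


def drr_schedule(queues, quanta, rounds):
    """Return the (flow, packet_length) send order over a fixed number of rounds.

    queues: dict mapping flow -> deque of packet lengths in bytes (hypothetical input)
    quanta: dict mapping flow -> credits (bytes) added per round, i.e. the weight
    """
    deficit = {flow: 0 for flow in queues}
    sent = []
    for _ in range(rounds):
        for flow, q in queues.items():
            if not q:
                deficit[flow] = 0  # an idle flow does not bank credits
                continue
            deficit[flow] += quanta[flow]  # add credits according to weight
            # Send head-of-line packets while the credits cover them.
            while q and q[0] <= deficit[flow]:
                length = q.popleft()
                deficit[flow] -= length
                sent.append((flow, length))
            if not q:
                deficit[flow] = 0  # queue drained: leftover credits are dropped
    return sent


# Assumed example: flow A sends large packets, flow B small ones, equal weights.
queues = {"A": deque([1500, 1500]), "B": deque([500, 500, 500, 500])}
quanta = {"A": 1000, "B": 1000}
order = drr_schedule(queues, quanta, rounds=3)
```

In this example flow A cannot send in the first round because its 1000 credits do not cover a 1500-byte packet; the unused credits carry over, which is the "deficit" that keeps long-run bandwidth shares close to the weights.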

4 NetFPGA System
8-port 10MB/s duplex Ethernet
Control FPGA (CFPGA) handles the physical interface (MAC)
Our design targets both User FPGAs (UFPGA)
[Figure labels: CFPGA, UFPGA1, UFPGA0, 1MB SRAM, 10MB/s Ethernet]

5 Design Considerations
4 MACs behind each of the 8 ports
Each flow is a unique Source Address – Destination Address pair
–~1024 flows
Split across FPGAs
–Each UFPGA reads incoming packets from a different set of ports (0-3 and 4-7)
–Tradeoff between memory storage and fairness across all flows

6 Memory Buffer Allocation
Static partitioning of the 1MB SRAM across 512 flows gives 2 kbytes per flow, which is less than 2 maximum-size packets
Need more dynamic allocation
–Segments: a smaller size means less fragmentation but more pointer and list-handling overhead; 128 bytes was chosen
–Keep a list of free segments
–Keep on chip only the pointers to the head and tail of each flow
[Figure labels: P1, P2, P3, P4, P5, P6]
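The segment scheme on this slide can be sketched as linked lists threaded through one pool. This is an illustrative model only, assuming the next-segment pointers live in a single array and that a packet's segments are released in one step on dequeue; the class name `SegmentPool` and its methods are hypothetical, and only the 1MB, 512-flow, and 128-byte figures come from the slide.

```python
SEGMENT_BYTES = 128                          # segment size chosen on the slide
SRAM_BYTES = 1 << 20                         # 1MB packet buffer
NUM_SEGMENTS = SRAM_BYTES // SEGMENT_BYTES   # 8192 segments
STATIC_BYTES_PER_FLOW = SRAM_BYTES // 512    # 2048 bytes under static partitioning


class SegmentPool:
    """Free list plus per-flow head/tail pointers over fixed-size segments."""

    def __init__(self, num_segments=NUM_SEGMENTS):
        # next_ptr[i] is the segment that follows segment i in whichever list holds it.
        self.next_ptr = list(range(1, num_segments)) + [None]
        self.free_head = 0
        self.free_count = num_segments
        self.head = {}   # flow -> first segment (the only per-flow state kept "on chip")
        self.tail = {}   # flow -> last segment

    def enqueue(self, flow, length):
        """Append a packet of `length` bytes to a flow; returns segments used."""
        needed = -(-length // SEGMENT_BYTES)   # ceiling division
        if needed > self.free_count:
            raise MemoryError("packet buffer full")
        for _ in range(needed):
            seg = self.free_head               # pop from the free list
            self.free_head = self.next_ptr[seg]
            self.next_ptr[seg] = None
            self.free_count -= 1
            if flow in self.tail:
                self.next_ptr[self.tail[flow]] = seg
            else:
                self.head[flow] = seg
            self.tail[flow] = seg
        return needed

    def dequeue(self, flow, length):
        """Release the segments of the flow's head packet back to the free list."""
        for _ in range(-(-length // SEGMENT_BYTES)):
            seg = self.head[flow]
            nxt = self.next_ptr[seg]
            self.next_ptr[seg] = self.free_head   # push onto the free list
            self.free_head = seg
            self.free_count += 1
            if nxt is None:
                del self.head[flow], self.tail[flow]
            else:
                self.head[flow] = nxt
```

With a pool like this, a 300-byte packet occupies three 128-byte segments, so at most 84 bytes are lost to internal fragmentation, while a burst on one flow can borrow buffer space that idle flows are not using.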

7 MAC Address Learning
Instead of being told which MAC addresses belong to which port, the switch learns them from the source addresses of incoming packets
–Because our split-FPGA design reads from different ports, the two FPGAs must communicate the MACs they learn to each other
When the destination MAC has not been learned yet, broadcast (send to all other ports)
So MAC learning implies broadcast capability
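The learn-then-forward rule can be sketched as follows. This is a simplified single-table model, so the sharing of learned addresses between the two FPGAs is not represented; the class name `LearningSwitch`, the string MAC addresses, and the choice to drop a frame whose destination sits on the arrival port are assumptions for the example rather than details from the slides.

```python
class LearningSwitch:
    """Learns source MACs per port and floods frames to unknown destinations."""

    def __init__(self, num_ports=8):
        self.num_ports = num_ports
        self.table = {}  # MAC address -> port on which it was last seen as a source

    def forward(self, in_port, src, dst):
        """Return the list of output ports for a frame arriving on in_port."""
        self.table[src] = in_port            # learn from the source address
        out_port = self.table.get(dst)
        if out_port is None:
            # Destination not learned yet: send to all other ports.
            return [p for p in range(self.num_ports) if p != in_port]
        if out_port == in_port:
            return []                        # assumed: no need to send it back
        return [out_port]


sw = LearningSwitch()
first = sw.forward(0, "aa", "bb")    # "bb" unknown, so the frame is flooded
reply = sw.forward(3, "bb", "aa")    # "aa" was learned on port 0
second = sw.forward(0, "aa", "bb")   # "bb" is now known to be on port 3
```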

8 Read Operation
[Block diagram. Blocks: Master Control, Packet Memory Manager, MAC Learning, Flow Assignment, DRR Engine, Control Handler, 1 MB SRAM, CFPGA Interface. Signal labels: DA, SA; Flow ID; Flow Tail; Length, ptr; Read, port; Share SA]

9 Write Operation
[Block diagram. Blocks: Master Control, Packet Memory Manager, MAC Learning, Flow Assignment, DRR Engine, Control Handler, 1 MB SRAM, CFPGA Interface. Signal labels: Head, length; Next head, length, latency; Write, port; Port REQ; Port GNT; Data Ready]

10 DRR Engine
How to handle 512 flows and stay work conserving:
–Only one flow active at any time
–DRR allocation happens on dequeuing
–FIFOs contain the next flow to be serviced for each port
Statistics per flow
–Weight
–Latency
–Bytes sent
–Packets sent
–Packets active
[Figure labels: flow data in a 512 x 160-bit SRAM; one FIFO for each of ports 0 through 7]
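One way to read the per-port FIFO idea is the active-list form of Deficit Round Robin, in which a port's FIFO holds only the flows that currently have packets queued, so the scheduler never scans idle flows. The sketch below models a single output port under that reading; it is an assumption about the intent, not a description of the actual engine, and `PortScheduler`, `service_next_flow`, and the quantum values are hypothetical names and numbers.

```python
from collections import deque


class PortScheduler:
    """Active-list DRR for one output port: the FIFO holds backlogged flows only."""

    def __init__(self, quantum):
        self.quantum = quantum   # flow -> credits (bytes) granted per visit
        self.active = deque()    # FIFO of flows waiting to be serviced
        self.packets = {}        # flow -> deque of queued packet lengths
        self.deficit = {}        # flow -> unused credits

    def enqueue(self, flow, length):
        q = self.packets.setdefault(flow, deque())
        if not q:
            # The flow just became backlogged: it joins the port FIFO once.
            self.active.append(flow)
            self.deficit[flow] = 0
        q.append(length)

    def service_next_flow(self):
        """Credit allocation happens here, at dequeue time."""
        if not self.active:
            return []
        flow = self.active.popleft()
        q = self.packets[flow]
        self.deficit[flow] += self.quantum[flow]
        sent = []
        while q and q[0] <= self.deficit[flow]:
            length = q.popleft()
            self.deficit[flow] -= length
            sent.append((flow, length))
        if q:
            self.active.append(flow)   # still backlogged: back of the FIFO
        else:
            self.deficit[flow] = 0     # drained: leaves the FIFO, credits reset
        return sent


sched = PortScheduler({"A": 1000, "B": 1000})
sched.enqueue("A", 1500)
sched.enqueue("B", 500)
visits = [sched.service_next_flow() for _ in range(4)]
```

Because each visit touches only the flow at the head of the FIFO, the cost per scheduling decision does not grow with the number of flows, which matters when 512 flows share one engine.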

11 Conclusion
A Deficit Round Robin switch with 1k flows has been implemented
Provides dynamic memory buffer allocation, MAC learning, and broadcast
Parallel design split across 2 chips
Gathers statistics on flows