1 Growth in Router Capacity. IPAM, Lake Arrowhead, October 2003. Nick McKeown, Professor of Electrical Engineering and Computer Science, Stanford University.

2 Generic Router Architecture. [Diagram: per-packet datapath.] Header processing: lookup the IP address and update the header, using an address table (~1M prefixes, mapping IP address to next hop) held in off-chip DRAM. Queueing: the packet is placed in a buffer memory (~1M packets), also held in off-chip DRAM.

3 Generic Router Architecture. [Diagram: the same structure replicated per linecard; each linecard has its own header-processing block (lookup, header update, address table) and its own buffer manager with attached buffer memory.]

4 What a High-Performance Router Looks Like. Cisco GSR: 6 ft tall, 19 in wide, 2 ft deep; capacity 160Gb/s; power 4.2kW. Juniper M160: 3 ft tall, 19 in wide, 2.5 ft deep; capacity 80Gb/s; power 2.6kW.

5 Backbone router capacity. [Plot: router capacity per rack, log scale from 1Gb/s to 1Tb/s, doubling every 18 months.]

6 Backbone router capacity. [Same plot with traffic overlaid: router capacity per rack doubles every 18 months, while traffic doubles every year.]

7 Extrapolating. [Plot extended to 2015, axis from 1Tb/s to 100Tb/s: router capacity doubles every 18 months, traffic doubles every year. By 2015 the two curves differ by 16x.]
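The 16x figure follows directly from compounding the two growth rates; a minimal sketch of the arithmetic, assuming 2003 (the talk date) as the starting year:

```python
# Compound the two trends from 2003 to 2015 and compare.
years = 2015 - 2003                     # 12 years of extrapolation
capacity_growth = 2 ** (years / 1.5)    # 2x every 18 months -> 2^8 = 256x
traffic_growth = 2 ** (years / 1.0)     # 2x every 12 months -> 2^12 = 4096x
disparity = traffic_growth / capacity_growth
print(f"capacity x{capacity_growth:.0f}, traffic x{traffic_growth:.0f}, "
      f"disparity x{disparity:.0f}")    # -> disparity x16
```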

8 Consequence. Unless something changes, operators will need:
- 16 times as many routers, consuming
- 16 times as much space,
- 256 times the power,
- costing 100 times as much.
And actually they will need more than that…

9 What limits router capacity? [Chart: approximate power consumption per rack.] Power density is the limiting factor today.

10 Trend: multi-rack routers. [Diagram: linecard racks connected to a separate crossbar switch rack.] Spreading the system across racks reduces power density.

11 Alcatel 7670 RSP, Juniper TX8/T640, Chiaro, Avici TSR.

12 Trend: Single-POP routers.
- Very high capacity (10+ Tb/s)
- Line rates from T1 to OC768
Reasons:
- A big multi-rack router is more efficient than many single-rack routers,
- It is easier to manage fewer routers.

13 Router linecard. [Block diagram: optics, physical layer, framing & maintenance, packet processing with lookup tables, buffer management & scheduling with buffer & state memory, scheduler interface.] A typical OC192c linecard:
- ~30M gates
- 2.5 Gbits of memory
- 200-300 W
- ~1 m²
- $25k cost, $100k price.
40-55% of the power goes into chip-to-chip serial links.

14 What's hard, what's not.
- Line-rate forwarding:
  - Line-rate longest-prefix match (LPM) was an issue for a while.
  - Commercial TCAMs and algorithms are available up to 100Gb/s.
  - 1M prefixes fit in a corner of a 90nm ASIC.
  - All 2^32 IPv4 addresses will fit in a $10 DRAM in 8 years.
- Packet buffering:
  - Not a problem up to about 10Gb/s; a big problem above 10Gb/s. More on this later…
- Header processing:
  - For basic IPv4 operations: not a problem.
  - If we keep adding functions, it will be a problem. More on this later…
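For context, longest-prefix match is the lookup the forwarding path performs on every packet. A minimal binary-trie sketch (illustrative only; real hardware uses TCAMs or compressed multibit tries, and the prefixes below are made up):

```python
import ipaddress

class BinaryTrieLPM:
    """Toy longest-prefix-match table: one trie node per prefix bit."""
    def __init__(self):
        self.root = {}            # each node: {'0': child, '1': child, 'hop': next_hop}

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        bits = format(int(net.network_address), "032b")[: net.prefixlen]
        node = self.root
        for b in bits:
            node = node.setdefault(b, {})
        node["hop"] = next_hop

    def lookup(self, addr):
        bits = format(int(ipaddress.ip_address(addr)), "032b")
        node, best = self.root, None
        for b in bits:
            if "hop" in node:
                best = node["hop"]   # remember the longest match seen so far
            if b not in node:
                break
            node = node[b]
        else:
            best = node.get("hop", best)
        return best

# Hypothetical prefixes, for illustration only.
table = BinaryTrieLPM()
table.insert("10.0.0.0/8", "if0")
table.insert("10.1.0.0/16", "if1")
print(table.lookup("10.1.2.3"))      # -> "if1" (the /16 wins over the /8)
```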

15 What's hard, what's not (2). Switching:
- If throughput doesn't matter: easy; lots of multistage, distributed, or load-balanced switch fabrics.
- If throughput matters: use a crossbar, VOQs and a centralized scheduler, or a multistage fabric and lots of speedup.
- If a throughput guarantee is required: maximal matching, VOQs and a speedup of two [Dai & Prabhakar '00]; or a load-balanced 2-stage switch [Chang '01; Sigcomm '03].

16 What's hard:
- Packet buffers above 10Gb/s
- Extra processing on the datapath
- Switching with throughput guarantees

17 Packet Buffering Problem. Packet buffers for a 160Gb/s router linecard. [Diagram: buffer manager between the line and a 40Gbit buffer memory, with scheduler requests driving reads.] Write rate R: one 128B packet every 6.4ns. Read rate R: one 128B packet every 6.4ns. The problem is solved if a memory can be randomly accessed every 3.2ns and store 40Gb of data.
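The 6.4ns and 3.2ns figures follow directly from the line rate, and the 40Gbit figure follows if the buffer is sized for a round-trip time of roughly 0.25s (a conventional rule of thumb, assumed here). A quick check of the numbers:

```python
line_rate = 160e9                      # bits per second
packet_bits = 128 * 8                  # one 128-byte packet
packet_time = packet_bits / line_rate  # time budget per packet
# One write (arrival) and one read (departure) per packet time,
# so the memory needs a random access every packet_time / 2.
print(f"packet every {packet_time*1e9:.1f} ns, "
      f"access every {packet_time/2*1e9:.1f} ns")       # -> 6.4 ns, 3.2 ns
# Buffer of roughly RTT x line rate (0.25 s assumed) gives ~40 Gbit.
print(f"buffer ~ {0.25 * line_rate / 1e9:.0f} Gbit")
```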

18 Memory Technology.
- Use SRAM? (+) Fast enough random access time, but (-) too low density to store 40Gbits of data.
- Use DRAM? (+) High density means we can store the data, but (-) it can't meet the random access time.

19 Can't we just use lots of DRAMs in parallel? [Diagram: buffer manager striping data across several DRAM buffer memories, with scheduler requests driving reads.] The write rate R and read rate R are unchanged (one 128B packet every 6.4ns), but packets are read and written in larger 1280B blocks spread across the DRAMs.

20 Works fine if there is only one FIFO. [Diagram: arriving 128B packets are aggregated in the buffer manager's on-chip SRAM until a 1280B block is full, which is then written across all DRAMs in parallel; reads do the reverse.] Aggregate 1280B for the queue in fast SRAM and read and write to all DRAMs in parallel.

21 In practice, the buffer holds many FIFOs (queues 1…Q). For example, in an IP router Q might be 200; in an ATM switch Q might be …. [Diagram: partially filled blocks (e.g. 320B, 1280B) accumulate for different queues in the buffer manager.] How can we write multiple packets into different queues?

22 Parallel Packet Buffer: Hybrid Memory Hierarchy. [Diagram: arriving packets at rate R enter a small tail SRAM cache (one tail per FIFO, on the buffer-manager ASIC); a large DRAM holds the body of each FIFO, written and read b bytes at a time; a small head SRAM cache holds the FIFO heads, from which departing packets leave at rate R under scheduler requests. b = degree of parallelism.]
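A minimal sketch of the tail-cache side of this hierarchy (illustrative only; the real memory-management algorithm, including the head cache and its lookahead, is what the bounds on the next slide are about, and the queue-selection policy below is just one reasonable choice):

```python
from collections import defaultdict, deque

class TailCache:
    """Toy write path of the hybrid buffer: collect b bytes per queue in
    SRAM, then transfer one full b-byte block per DRAM write slot."""
    def __init__(self, b):
        self.b = b
        self.sram = defaultdict(bytearray)   # queue id -> buffered tail bytes
        self.dram = defaultdict(deque)       # queue id -> list of b-byte blocks

    def write_packet(self, q, data):
        self.sram[q] += data                 # packets accumulate in on-chip SRAM

    def dram_write_slot(self):
        # Serve the queue with the most buffered bytes (one plausible policy;
        # the papers analyze which choice minimizes the SRAM bound).
        q = max(self.sram, key=lambda k: len(self.sram[k]), default=None)
        if q is not None and len(self.sram[q]) >= self.b:
            block, self.sram[q] = self.sram[q][:self.b], self.sram[q][self.b:]
            self.dram[q].append(bytes(block))    # one wide write to all DRAMs

# Example with the slide's numbers: b = 1280 bytes, 128B packets.
cache = TailCache(b=1280)
for _ in range(10):
    cache.write_packet(q=7, data=b"\x00" * 128)
cache.dram_write_slot()
print(len(cache.dram[7]), "block(s) in DRAM for queue 7")   # -> 1
```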

23 Problem: what is the minimum size of SRAM needed so that every packet is available either immediately or within a fixed latency?
Solutions:
- Qb(2 + ln Q) bytes, for zero latency;
- Q(b - 1) bytes, for a pipeline Q(b - 1) + 1 deep.
Examples:
1. 160Gb/s linecard, b = 1280, Q = 625: SRAM = 52 Mbits (zero latency).
2. 160Gb/s linecard, b = 1280, Q = 625: SRAM = 6.1 Mbits, pipeline is 40 µs.
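These figures can be reproduced from the two formulas on the slide; a quick check (interpreting the pipeline depth as byte slots at the 160Gb/s line rate and reporting binary megabits, which reproduces the stated numbers):

```python
import math

R = 160e9          # line rate, bits/s
b = 1280           # DRAM block size, bytes (degree of parallelism)
Q = 625            # number of FIFOs

# Zero-latency head cache: Qb(2 + ln Q) bytes.
sram_zero = Q * b * (2 + math.log(Q))
# Pipelined head cache: Q(b - 1) bytes, with a pipeline Q(b - 1) + 1 deep.
sram_pipe = Q * (b - 1)
pipeline_slots = Q * (b - 1) + 1
pipeline_secs = pipeline_slots * 8 / R          # one byte slot = 8 bits at rate R

print(f"zero-latency SRAM ~ {sram_zero*8/2**20:.0f} Mbit")   # ~52 Mbit
print(f"pipelined SRAM    ~ {sram_pipe*8/2**20:.1f} Mbit")   # ~6.1 Mbit
print(f"pipeline latency  ~ {pipeline_secs*1e6:.0f} us")     # ~40 us
```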

24 Discussion. [Plot: required SRAM size versus pipeline latency x, for Q = 1000 and b = 10, falling from the zero-latency queue-length bound to the maximum-latency bound.]

25 Why it's interesting.
- This is a problem faced by every linecard, network switch and network processor starting at 10Gb/s.
- All commercial routers use ad-hoc memory-management algorithms with no guarantees.
- We have the only (and optimal) solution that is guaranteed to work for all traffic patterns.

26 What's hard:
- Packet buffers above 10Gb/s
- Extra processing on the datapath
- Switching with throughput guarantees

27 Recent trends:
- DRAM random access time: 1.1x every 18 months
- Moore's Law: 2x every 18 months
- Line capacity: 2x every 7 months
- User traffic: 2x every 12 months

28 Packet processing gets harder. [Plot: instructions per arriving byte over time. "What we'd like" rises with more features (QoS, multicast, security, …); "what will happen" falls, because line rates grow faster than processing.]

29 Packet processing gets harder. [Plot: clock cycles available per minimum-length packet, since 1996.]
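The quantity plotted is easy to compute for any given generation; a small illustration with assumed line-rate and clock pairs (the specific numbers below are examples, not the values from the plot):

```python
# Clock cycles available to process one minimum-length (40B) IP packet,
# for a few illustrative line-rate / clock-frequency pairs (assumed values).
MIN_PACKET_BITS = 40 * 8

examples = [
    ("OC-48,  2.5 Gb/s, 200 MHz chip",  2.5e9, 200e6),
    ("OC-192, 10 Gb/s,  400 MHz chip", 10e9,  400e6),
    ("OC-768, 40 Gb/s,  800 MHz chip", 40e9,  800e6),
]
for name, line_rate, clock in examples:
    packet_time = MIN_PACKET_BITS / line_rate    # time before the next packet
    cycles = packet_time * clock
    print(f"{name}: {cycles:.0f} cycles per minimum-length packet")
# Even though clocks rise, the cycle budget per packet keeps shrinking.
```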

30 What's hard:
- Packet buffers above 10Gb/s
- Extra processing on the datapath
- Switching with throughput guarantees

31 Potted history.
1. [Karol et al. 1987] Throughput limited to 58% by head-of-line blocking for Bernoulli IID uniform traffic.
2. [Tamir 1989] Observed that with "Virtual Output Queues" (VOQs), head-of-line blocking is reduced and throughput goes up.

32 Potted history.
3. [Anderson et al. 1993] Observed the analogy to maximum-size matching in a bipartite graph.
4. [M et al. 1995] (a) A maximum-size match cannot guarantee 100% throughput. (b) But a maximum-weight match can, at O(N^3) complexity.
5. [Mekkittikul and M 1998] A carefully picked maximum-size match can give 100% throughput, with O(N^2.5) matching complexity.

33 Potted history: speedup.
5. [Chuang, Goel et al. 1997] Precise emulation of a central shared-memory switch is possible with a speedup of two and a "stable marriage" scheduling algorithm.
6. [Prabhakar and Dai 2000] 100% throughput is possible for maximal matching with a speedup of two.

34 Potted history: newer approaches.
7. [Tassiulas 1998] 100% throughput is possible for a simple randomized algorithm with memory.
8. [Giaccone et al. 2001] "Apsara" algorithms.
9. [Iyer and M 2000] Parallel switches can achieve 100% throughput and emulate an output-queued switch.
10. [Chang et al. 2000; Keslassy et al. Sigcomm 2003] A 2-stage switch with no scheduler can give 100% throughput.
11. [Iyer, Zhang and M 2002] Distributed shared-memory switches can emulate an output-queued switch.

35 Basic Switch Model. [Diagram: N×N input-queued switch. A_i(n) is the arrival process at input i, split into per-output arrivals A_ij(n); L_ij(n) is the occupancy of VOQ(i,j); S(n) is the switch configuration chosen at time n; D_j(n) is the departure process at output j.]
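In symbols, the dynamics this diagram encodes are usually written as a simple recursion; a sketch in the slide's notation (the exact form on the original slide may differ):

```latex
% VOQ occupancy evolution for an N x N input-queued switch: A_{ij}(n)
% cells arrive at input i for output j, S(n) is the (permutation)
% schedule, D_j(n) the departures at output j.
\[
  L_{ij}(n+1) = \Big[\, L_{ij}(n) + A_{ij}(n) - S_{ij}(n) \,\Big]^{+},
  \qquad
  D_j(n) = \sum_{i=1}^{N} S_{ij}(n)\,\mathbf{1}\{ L_{ij}(n) > 0 \}.
\]
```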

36 Some definitions of throughput
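The definition implicitly used in the rest of the talk is the standard one for input-queued switches; a hedged restatement (this is the conventional formulation, not necessarily the slide's exact wording):

```latex
% A scheduling algorithm achieves 100% throughput if every admissible
% arrival rate matrix (no input or output oversubscribed) keeps all
% VOQs stable.
\[
  \sum_{j=1}^{N} \lambda_{ij} < 1 \ \ \forall i,
  \qquad
  \sum_{i=1}^{N} \lambda_{ij} < 1 \ \ \forall j
  \;\Longrightarrow\;
  \limsup_{n \to \infty} \, \mathbb{E}\big[ L_{ij}(n) \big] < \infty
  \ \ \forall i, j.
\]
```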

37 Scheduling algorithms to achieve 100% throughput.
1. When traffic is uniform: many algorithms…
2. When traffic is non-uniform but the traffic matrix is known. Technique: Birkhoff-von Neumann decomposition.
3. When the matrix is not known. Technique: Lyapunov function.
4. When the algorithm is pipelined, or information is incomplete. Technique: Lyapunov function.
5. When the algorithm does not complete. Technique: randomized algorithm.
6. When there is speedup. Technique: fluid model.
7. When there is no algorithm. Techniques: 2-stage load-balancing switch; Parallel Packet Switch.
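As an illustration of item 2, a minimal Birkhoff-von Neumann decomposition: repeatedly find a perfect matching on the support of the (doubly stochastic) rate matrix and peel it off with the largest possible weight. This sketch uses a simple augmenting-path matching and is illustrative, not a production scheduler; the rate matrix at the end is made up.

```python
import numpy as np

def perfect_matching(support):
    """Perfect matching on an NxN boolean support matrix (augmenting paths)."""
    n = len(support)
    match_col = [-1] * n                          # column -> matched row
    def try_row(r, seen):
        for c in range(n):
            if support[r][c] and not seen[c]:
                seen[c] = True
                if match_col[c] < 0 or try_row(match_col[c], seen):
                    match_col[c] = r
                    return True
        return False
    for r in range(n):
        if not try_row(r, [False] * n):
            return None
    return {match_col[c]: c for c in range(n)}    # row -> column

def birkhoff_von_neumann(rates, tol=1e-9):
    """Decompose a doubly stochastic matrix into weighted permutations."""
    R = np.array(rates, dtype=float)
    schedule = []
    while R.sum() > tol:
        m = perfect_matching(R > tol)
        if m is None:
            break                                 # numerical leftovers
        weight = min(R[r, c] for r, c in m.items())
        P = np.zeros_like(R)
        for r, c in m.items():
            P[r, c] = 1.0
        schedule.append((weight, P))
        R -= weight * P                           # peel off this permutation
    return schedule

# Hypothetical 3x3 doubly stochastic rate matrix, for illustration only.
rates = [[0.5, 0.3, 0.2],
         [0.2, 0.5, 0.3],
         [0.3, 0.2, 0.5]]
for w, P in birkhoff_von_neumann(rates):
    print(f"serve this permutation for a fraction {w:.2f} of time slots")
```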

38 Outline

39 Throughput results.
Theory:
- Input queueing (IQ): 58% [Karol, 1987]
- IQ + VOQ, maximum weight matching: 100% [M et al., 1995]
- Different weight functions, incomplete information, pipelining: 100% [Various]
- Randomized algorithms: 100% [Tassiulas, 1998]
- IQ + VOQ, maximal size matching, speedup of two: 100% [Dai & Prabhakar, 2000]
Practice:
- Input queueing (IQ)
- IQ + VOQ, sub-maximal size matching (e.g. PIM, iSLIP)
- Various heuristics, distributed algorithms, and amounts of speedup
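The "maximum weight matching" entry is easy to illustrate: each time slot, weight VOQ(i,j) by its occupancy and pick the permutation of maximum total weight. A sketch using SciPy's assignment solver (illustrative; real schedulers approximate this because an exact matching per slot is too slow at these line rates, and the occupancies below are made up):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def max_weight_schedule(voq_lengths):
    """One MWM decision: voq_lengths[i][j] = cells queued at input i for
    output j. Returns the chosen matching as a list of (input, output)."""
    L = np.asarray(voq_lengths)
    rows, cols = linear_sum_assignment(L, maximize=True)   # max-weight matching
    return [(i, j) for i, j in zip(rows, cols) if L[i, j] > 0]

# Hypothetical VOQ occupancies for a 3x3 switch, for illustration only.
voqs = [[5, 0, 2],
        [1, 4, 0],
        [0, 3, 6]]
print(max_weight_schedule(voqs))   # -> [(0, 0), (1, 1), (2, 2)]
```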

40 Trends in Switching.
- Fastest centralized scheduler with a throughput guarantee: ~1Tb/s.
- Complexity scales as O(n^2).
- Capacity grows <<2x every 18 months.
- Hence, load-balanced switches.

41 Stanford 100Tb/s Internet Router. Goal: study scalability.
- Challenging, but not impossible.
- Two orders of magnitude faster than deployed routers.
- We will build components to show feasibility.
[Diagram: electronic linecards #1 … #625, each providing line termination, IP packet processing and packet buffering at 160Gb/s, interconnected through a 40Gb/s optical switch; 100Tb/s = 640 × 160Gb/s.]

42 Question. Can we use an optical fabric at 100Tb/s with 100% throughput? Conventional answer: no.
- The switch would need to be reconfigured too often.
- 100% throughput requires a complex electronic scheduler.

43 Two-stage load-balancing switch. [Diagram: inputs at rate R enter a load-balancing stage that spreads traffic over intermediate ports at rate R/N each; a second switching stage delivers it to the outputs at rate R.] 100% throughput for weakly mixing, stochastic traffic. [C.-S. Chang; Valiant]
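A minimal simulation sketch of the idea (illustrative only): both stages simply cycle through N fixed permutations, so no scheduler and no knowledge of the traffic is needed. Stage 1 spreads each input's packets evenly over the intermediate ports; stage 2 moves them on toward their outputs.

```python
from collections import deque

N = 4  # switch size (illustrative)

# VOQs at the intermediate stage: voq[m][d] holds packets that landed on
# middle port m and are destined to output d.
voq = [[deque() for _ in range(N)] for _ in range(N)]
delivered = []

def time_slot(t, arrivals):
    """arrivals: list of (input, destination) packets for this slot."""
    # Stage 1 (load balancing): input i connects to middle port (i + t) mod N,
    # i.e. each input cycles over all middle ports regardless of destination.
    for i, dest in arrivals:
        middle = (i + t) % N
        voq[middle][dest].append((i, dest))
    # Stage 2 (switching): middle port m connects to output (m + t) mod N,
    # again a fixed cyclic permutation, and serves that VOQ if non-empty.
    for m in range(N):
        out = (m + t) % N
        if voq[m][out]:
            delivered.append(voq[m][out].popleft())

# Example: input 0 sends a burst of packets, all to output 2.
for t in range(12):
    time_slot(t, arrivals=[(0, 2)])
print(f"{len(delivered)} packets delivered to output 2")
```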

44 [Two-stage load-balancing switch diagram, repeated.]

45 [Two-stage load-balancing switch diagram, repeated.]

46 Chang's load-balanced switch: good properties.
1. 100% throughput for a broad class of traffic.
2. No scheduler needed, hence scalable.

47 Chang's load-balanced switch: bad properties.
1. Packet mis-sequencing.
2. Pathological traffic patterns: throughput drops to 1/N-th of capacity.
3. Uses two switch fabrics: hard to package.
4. Doesn't work with some linecards missing: impractical.
FOFF, our load-balancing algorithm, addresses these:
- Packet sequence maintained.
- No pathological patterns.
- 100% throughput, always.
- Delay within a bound of the ideal.

48 100Tb/s Load-Balanced Router. [Diagram: linecard racks 1 … G = 40, each holding L linecards, connected ×40 to a central MEMS switch rack drawing < 100W.]

49 Summary of trends.
- Multi-rack routers.
- Single-router POPs.
- No commercial router provides a 100% throughput guarantee.
- Address lookups: not a problem up to 160+Gb/s per linecard.
- Packet buffering: imperfect; loss of throughput above 10Gb/s.
- Switching: centralized schedulers up to about 1Tb/s; load-balanced 2-stage switches with 100% throughput.