
1 Scaling Internet Routers Using Optics
Isaac Keslassy, Shang-Tse Da Chuang, Kyoungsik Yu, David Miller, Mark Horowitz, Olav Solgaard, Nick McKeown
Department of Electrical Engineering, Stanford University
http://yuba/~keslassy/papers/

2 Backbone router capacity
[Figure: router capacity per rack, log scale from 1Gb/s to 1Tb/s]
Router capacity per rack: 2x every 18 months

3 Backbone router capacity
[Figure: router capacity per rack, log scale from 1Gb/s to 1Tb/s, with traffic growth added]
Router capacity per rack: 2x every 18 months
Traffic: 2x every year

4 Extrapolating
[Figure: both trends extended to 2015, log scale from 1Tb/s to 100Tb/s]
Router capacity: 2x every 18 months
Traffic: 2x every year
2015: 16x disparity
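The 16x figure follows directly from the two doubling periods. The sketch below is a minimal illustration, assuming both curves start from parity in some baseline year (the slide does not state the baseline, and the function name is illustrative only).

```python
def disparity(years: float,
              traffic_doubling: float = 1.0,
              capacity_doubling: float = 1.5) -> float:
    """Traffic growth divided by router-capacity growth after `years`."""
    traffic_growth = 2.0 ** (years / traffic_doubling)
    capacity_growth = 2.0 ** (years / capacity_doubling)
    return traffic_growth / capacity_growth


# After 12 years: traffic grows 2^12, capacity grows 2^8, so the gap is 2^4.
gap_after_12_years = disparity(12)
print(gap_after_12_years)
```

With these rates the gap doubles every three years, so a 16x disparity in 2015 corresponds to a horizon of about 12 years.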

5 Consequence
Unless something changes, operators will need:
- 16 times as many routers, consuming
- 16 times as much space,
- 256 times the power,
- costing 100 times as much.
Actually need more than that…

6 Stanford 100Tb/s Internet Router
Goal: Study scalability
- Challenging, but not impossible
- Two orders of magnitude faster than deployed routers
- We will build components to show feasibility
[Figure: electronic linecards #1 to #625, each with line termination, IP packet processing, and packet buffering, connected at 160-320Gb/s to an optical switch. Other labels: 40Gb/s, 160Gb/s.]
100Tb/s = 640 * 160Gb/s

7 Throughput Guarantees
- Operators increasingly demand throughput guarantees:
  - To maximize use of expensive long-haul links
  - For predictability and planning
- Despite lots of effort and theory, no commercial router today has a throughput guarantee.

8 Requirements of our router
- 100Tb/s capacity
- 100% throughput for all traffic
- Must work with any set of linecards present
- Use technology available within 3 years
- Conform to RFC 1812

9 What limits router capacity?
[Figure: approximate power consumption per rack]
Power density is the limiting factor today

10 Trend: Multi-rack routers
[Figure labels: Crossbar, Linecards; Switch, Linecards]
Reduces power density

11 [Figure labels: Alcatel 7670 RSP, Juniper TX8/T640, Chiaro, Avici TSR]

12 Limits to scaling
- Overall power is dominated by linecards
  - Sheer number
  - Optical WAN components
  - Per packet processing and buffering
- But power density is dominated by switch fabric

13 Trend: Multi-rack routers
Reduces power density
[Figure labels: Switch, Linecards]
Limit today ~2.5Tb/s
- Electronics
- Scheduler scales <2x every 18 months
- Opto-electronic conversion

14 Multi-rack routers
[Figure labels: WAN, Linecard, In, Out, Switch fabric]

15 Question
- Instead, can we use an optical fabric at 100Tb/s with 100% throughput?
- Conventional answer: No.
  - Need to reconfigure switch too often
  - 100% throughput requires complex electronic scheduler.

16 Outline
- How to guarantee 100% throughput?
- How to eliminate the scheduler?
- How to use an optical switch fabric?
- How to make it scalable and practical?

17 100% Throughput
[Figure: N inputs and N outputs, each at rate R, joined by links of unknown rate (?)]
Router capacity = NR
Switch capacity = N²R

18 If traffic is uniform
[Figure: inputs and outputs at rate R, joined by links of rate R/N]

19 Real traffic is not uniform
[Figure: the same mesh of R/N links under non-uniform load, marked with a question mark]
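To make the contrast between slides 18 and 19 concrete, the sketch below checks which links of an R/N mesh are overloaded by a given traffic matrix. It is an illustration only: the helper name, the normalization R = 1, and the choice N = 4 are assumptions, not values from the slides.

```python
from fractions import Fraction


def overloaded_links(traffic, link_rate):
    """Return (input, output) pairs whose offered rate exceeds the link rate."""
    return [(i, j)
            for i, row in enumerate(traffic)
            for j, rate in enumerate(row)
            if rate > link_rate]


N = 4                      # assumed port count for the example
R = Fraction(1)            # line rate, normalized to 1
link_rate = R / N          # each mesh link runs at R/N

# Uniform traffic: every input sends R/N to every output.
uniform = [[R / N for _ in range(N)] for _ in range(N)]

# Non-uniform traffic: input i sends its full rate R to output i.
diagonal = [[R if i == j else Fraction(0) for j in range(N)]
            for i in range(N)]

uniform_overloads = overloaded_links(uniform, link_rate)
diagonal_overloads = overloaded_links(diagonal, link_rate)
print(uniform_overloads, diagonal_overloads)
```

The N² links at R/N add up to exactly NR, so the mesh has enough total capacity for either pattern; the non-uniform case fails only because the capacity sits on the wrong links.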

20 Two-stage load-balancing switch
[Figure: a load-balancing stage followed by a switching stage; ports run at rate R and each stage is a mesh of R/N links]
100% throughput for weakly mixing, stochastic traffic. [C.-S. Chang, Valiant]

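21 [Figure: animation frame of the two-stage switch, with packets labeled 1, 2, 3 on the R/N links]

22 [Figure: animation frame of the two-stage switch, with packets labeled 1, 2, 3 on the R/N links]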
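A small simulation can illustrate why the two-stage arrangement copes with the non-uniform pattern that defeats a single R/N mesh. This is a sketch under stated assumptions, not the authors' implementation: both stages are modeled as round-robin connection patterns (one packet per port per time slot, equivalent in rate to a mesh of R/N links), each intermediate port keeps one queue per output, and the traffic pattern and function name are chosen only for illustration.

```python
from collections import deque


def simulate_load_balanced_switch(n: int, slots: int) -> float:
    """Simulate a two-stage load-balanced switch; return delivered/offered."""
    # voq[k][d]: packets waiting at intermediate port k for output d.
    voq = [[deque() for _ in range(n)] for _ in range(n)]
    offered = delivered = 0
    for t in range(slots):
        # Stage 1: input i is connected to intermediate (i + t) mod n,
        # regardless of where its packet is going.
        for i in range(n):
            dest = (i + 1) % n      # non-uniform: one output per input
            k = (i + t) % n
            voq[k][dest].append(t)
            offered += 1
        # Stage 2: intermediate k is connected to output (k + t) mod n.
        for k in range(n):
            out = (k + t) % n
            if voq[k][out]:
                voq[k][out].popleft()
                delivered += 1
    return delivered / offered


throughput = simulate_load_balanced_switch(n=4, slots=1000)
print(throughput)
```

Neither stage looks at the queue contents to decide its connections, which is the sense in which no scheduler is needed.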

23 Chang’s load-balanced switch: good properties
1. 100% throughput for broad class of traffic
2. No scheduler needed
   - Scalable

24 Chang’s load-balanced switch: bad properties
1. Packet mis-sequencing
2. Pathological traffic patterns
   - Throughput 1/N-th of capacity
3. Uses two switch fabrics
   - Hard to package
4. Doesn’t work with some linecards missing
   - Impractical
FOFF: Load-balancing algorithm
- Packet sequence maintained
- No pathological patterns
- 100% throughput, always
- Delay within bound of ideal
- (See paper for details)

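25 Single Mesh Switch
[Figure: In and Out ports at rate R joined by a single mesh of links at rate 2R/N; one linecard is marked]

26 Packaging
[Figure: linecards with In and Out ports at rate R, connected through a backplane; link rates R/N and 2R/N]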

27 Many fabric options
Options:
- Space: Full uniform mesh
- Time: Round-robin crossbar
- Wavelength: Static WDM
Any permutation network
[Figure: In and Out ports joined by channels C1, C2, …, CN]
N channels each at rate 2R/N

28 Static WDM switching
Array Waveguide Router (AWGR): passive and almost zero power
[Figure: four inputs each send wavelengths A, B, C, D into the AWGR; the four outputs receive A, A, A, A; B, B, B, B; C, C, C, C; and D, D, D, D]
4 WDM channels, each at rate 2R/N
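The static wavelength routing can be sketched in a few lines. The cyclic rule used here, wavelength w on input i leaves on output (i + w) mod N, is an assumption about the device chosen for illustration; the slide only shows that every output ends up with one channel from every input.

```python
def awgr_output(input_port: int, wavelength: int, n: int) -> int:
    """Assumed cyclic rule: wavelength w on input i exits at (i + w) mod n."""
    return (input_port + wavelength) % n


N = 4  # four ports and four WDM channels, as on the slide

# received[o] lists the (input, wavelength) pairs arriving at output o.
received = {o: [] for o in range(N)}
for i in range(N):
    for w in range(N):
        received[awgr_output(i, w, N)].append((i, w))

for o, arrivals in received.items():
    print(o, arrivals)
```

Because the mapping is fixed, the device needs no configuration: each output sees exactly one channel from each input, and no two arrivals at an output share a wavelength.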

29 Linecard dataflow
[Figure: packets arrive on In at rate R, pass through numbered steps 1 to 4 with WDM multiplexing and demultiplexing over channels 1 to N, and leave on Out at rate R]

30 Problems of scale
- For N < 64, WDM is a good solution.
- We want N = 640.
- Need to decompose.

31 Decomposing the mesh
[Figure: mesh between linecards 1 to 8, each link at rate 2R/8]

32 Decomposing the mesh
[Figure: linecards 1 to 8 with links at rates 2R/4 and 2R/8, labeled TDM and WDM]

33 When N is too large
Decompose into groups (or racks)
[Figure: Group/Rack 1 to Group/Rack G, each holding linecards 1, 2, …, L at rate 2R, connected through Array Waveguide Routers (AWGR) with ports 1, 2, …, G]

34 When a linecard is missing
- Each linecard spreads its data equally over every other linecard.
- Problem: If one is missing, or failed, then the spreading no longer works.

35 When a linecard fails
[Figure: three linecards at rate R with links at 2R/3, where 2R/3 + 2R/3 = 4R/3; a second case with links at 2R/3 + 2R/6, where 2R/3 + 2R/6 + 2R/3 + 2R/6 = 2R]
Solution:
1. Move light beams
   - Replace AWGR with MEMS switch.
   - Reconfigure when linecard added, removed or fails.
2. Finer channel granularity
   - Multiple paths.
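The 4R/3 on this slide can be reproduced with a short calculation. The sketch below assumes a hypothetical layout that the transcript does not spell out: three linecards in total, two in one group and one in the other, with every pair of linecards exchanging 2R/N. The function name is illustrative.

```python
from fractions import Fraction


def inter_group_rate(size_a: int, size_b: int, n: int) -> Fraction:
    """Rate from group A to group B, in units of R, with 2R/N per linecard pair."""
    per_pair = Fraction(2, n)       # each linecard pair exchanges 2R/N
    return size_a * size_b * per_pair


# Hypothetical layout: 3 linecards, 2 in one group and 1 in the other.
rate = inter_group_rate(size_a=2, size_b=1, n=3)
print(rate)   # in units of R
```

The point of the exercise is that the rate needed between two groups depends on how many linecards each group currently holds, so a fixed, equal split of channels between groups stops matching the load once a linecard is missing.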

36 Solution
Use transparent MEMS switches
[Figure: Group/Rack 1 to Group/Rack G=40, each holding linecards 1, 2, …, L at rate 2R, connected through MEMS switches with ports 1 to G]
Theorems:
1. Require L+G-1 MEMS switches
2. Polynomial time reconfiguration algorithm
MEMS switches reconfigured only when linecard added, removed or fails.
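The first theorem is easy to evaluate for the design point in this talk. A minimal sketch, taking G = 40 from this slide and L = 16 from the final slide; the function name is illustrative.

```python
def mems_switches_required(linecards_per_group: int, groups: int) -> int:
    """Number of MEMS switches from the slide's first theorem: L + G - 1."""
    return linecards_per_group + groups - 1


switch_count = mems_switches_required(linecards_per_group=16, groups=40)
print(switch_count)
```

The result matches the 55 optical-side ports of the 16 x 55 opto-electronic crossbar described later in the talk.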

37 Challenges
[Figure: linecard dataflow with address lookup, a packet switch, and WDM over channels 1 to G; R = 160Gb/s]
- How to build a 250ms 160Gb/s buffer?
- Low-cost, low-power optoelectronic conversion?
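The size of that buffer follows from the two numbers on the slide. A minimal sketch of the arithmetic, using decimal units; the function name is illustrative.

```python
def buffer_bits(line_rate_bps: int, buffer_ms: int) -> int:
    """Bits needed to hold `buffer_ms` milliseconds of traffic at the line rate."""
    return line_rate_bps * buffer_ms // 1000


bits = buffer_bits(line_rate_bps=160_000_000_000, buffer_ms=250)
size_bytes = bits // 8
print(bits, size_bytes)
```

That is 40 Gbit, or 5 GB, per linecard, which has to be written and read at the full line rate.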

38 What we are building
Chip #1: 160Gb/s Packet Buffer
- Buffer Manager, 90nm ASIC
- 250ms DRAM
- Interface labels: 160Gb/s, 320Gb/s
Chip #2: 16 x 55 Opto-electronic crossbar
- CMOS ASIC
- 16 x 10Gb/s to linecards
- 55 x 10Gb/s to optical fabric
- Optical source, optical modulator, optical detector

39 100Tb/s Load-Balanced Router
[Figure: linecard racks 1 to G = 40, each with L = 16 linecards at 160Gb/s, connected to a switch rack of 40 x 40 MEMS switches labeled 1, 2, 55, 56; switch rack < 100W]
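The headline numbers can be cross-checked from L, G, and the line rate. A minimal sketch using only values that appear on the slides; the variable names are illustrative.

```python
from fractions import Fraction

LINECARDS_PER_RACK = 16      # L, from the slide
RACKS = 40                   # G, from the slide
LINE_RATE_GBPS = 160         # R, from the slide

linecards = LINECARDS_PER_RACK * RACKS
capacity_tbps = Fraction(linecards * LINE_RATE_GBPS, 1000)
mesh_channel_gbps = Fraction(2 * LINE_RATE_GBPS, linecards)   # 2R/N

print(linecards, float(capacity_tbps), float(mesh_channel_gbps))
```

With N = 640 linecards the aggregate is 102.4Tb/s, which the talk rounds to 100Tb/s, and a full mesh would need 640 channels of only 0.5Gb/s per linecard, which is why the mesh is decomposed into groups.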

