
Slide 1: OR Project Group II: Packet Buffer Proposal
Da Chuang, Isaac Keslassy, Sundar Iyer, Greg Watson, Nick McKeown, Mark Horowitz
E-mail: stchuang@stanford.edu
Optical Router Project: http://klamath.stanford.edu/or/

Slide 2: Outline
- Load-Balancing Background
- Mis-sequencing Problem
- Datapath Architecture
- First stage - Segmentation
- Second stage - Main Buffering
- Third stage - Reassembly

Slide 3: 100 Tb/s router
[Diagram: 625 electronic linecards (#1 ... #625), each performing line termination, IP packet processing, and packet buffering at 160 Gb/s, connected to an arbitrated 160 Gb/s switch fabric via request/grant signaling; 100 Tb/s = 625 x 160 Gb/s.]

Slide 4: Load-Balanced Switch
[Diagram: N external inputs feed a load-balancing cyclic shift into N internal inputs; a second, switching cyclic shift delivers cells from the internal inputs to the N external outputs.]
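As background on the two "cyclic shift" stages, here is an illustrative Python sketch of the standard load-balanced switch connection pattern (my own illustration, not taken from the slides): in time slot t, each left-hand port i is connected to right-hand port (i + t) mod N, so over N slots every input spreads its traffic equally across all middle stages.

```python
def cyclic_shift(n_ports, t):
    """Connection pattern of one cyclic-shift stage in time slot t:
    left-hand port i is wired to right-hand port (i + t) mod N."""
    return {i: (i + t) % n_ports for i in range(n_ports)}

# Over N consecutive time slots, each external input is connected to every
# internal (middle-stage) input exactly once, and likewise for the second stage.
N = 4
for t in range(N):
    load_balancing = cyclic_shift(N, t)   # first stage: external inputs -> internal inputs
    switching = cyclic_shift(N, t)        # second stage: internal inputs -> external outputs
    print(t, load_balancing, switching)
```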

Slide 5: 160 Gbps Linecard
[Diagram: input block with lookup/processing, segmentation into fixed-size packets, and load-balancing; intermediate input block with VOQs and switching; output block with reassembly. All blocks run at rate R across the N linecards.]

Slide 6: Outline
- Load-Balancing Background
- Mis-sequencing Problem
- Datapath Architecture
- First stage - Segmentation
- Second stage - Main Buffering
- Third stage - Reassembly

Slide 7: Problem: Unbounded Mis-sequencing
[Diagram: N external inputs spread cells over a spanning set of permutations to the N internal inputs and on to the N external outputs, so cells of a flow can be arbitrarily mis-sequenced.]

Slide 8: Preventing Mis-sequencing
- Uniform Frame Spreading:
  - Group cells by frames of N cells each (frame building)
  - Spread each frame across all middle linecards
  - Each middle stage receives the same type of packets => has the same queue occupancy state
[Diagram: each input spreads a full frame of N cells across the N middle-stage linecards, one cell per middle stage.]
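As a rough illustration of uniform frame spreading (a sketch, not code from the deck; the class name FrameSpreader is invented), the following Python snippet accumulates arriving cells into frames of N and only then deals the frame out, one cell per middle-stage linecard, so every middle stage sees the same number of cells.

```python
from collections import deque

class FrameSpreader:
    """Minimal sketch of Uniform Frame Spreading for an N-port load-balanced switch."""

    def __init__(self, n_middle):
        self.n = n_middle            # number of middle-stage linecards (N)
        self.pending = deque()       # cells held back until the current frame fills

    def enqueue(self, cell):
        """Accept one arriving cell; return a completed frame assignment or None."""
        self.pending.append(cell)
        if len(self.pending) < self.n:
            return None              # frame not full yet -> keep holding the cells
        # Frame is full: spread its N cells, one per middle-stage linecard,
        # so every middle stage receives exactly one cell of this frame.
        frame = [self.pending.popleft() for _ in range(self.n)]
        return {middle: cell for middle, cell in enumerate(frame)}

# Example: with N = 4 middle stages, every 4th arriving cell releases a full frame.
spreader = FrameSpreader(n_middle=4)
for i in range(8):
    assignment = spreader.enqueue(f"cell{i}")
    if assignment is not None:
        print(assignment)   # one cell per middle-stage index 0..3
```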

Slide 9: Outline
- Load-Balancing Background
- Mis-sequencing Problem
- Datapath Architecture
- First stage - Segmentation
- Second stage - Main Buffering
- Third stage - Reassembly

Slide 10: Three stages on a linecard
[Diagram: 1st stage - Segmentation/Frame Building; 2nd stage - Main Buffering; 3rd stage - Reassembly. Full line rate R enters and leaves the card, with per-queue rate R/N into the main buffering stage; each stage fans out over N paths.]

Slide 11: Technology Assumptions in 2005
DRAM Technology:
- Access Time ~ 40 ns
- Size ~ 1 Gbit
- Memory Bandwidth ~ 16 Gb/s (16 data pins)
On-chip SRAM Technology:
- Access Time ~ 2.5 ns
- Size ~ 64 Mbits
Serial Link Technology:
- Bandwidth ~ 10 Gb/s
- >100 serial links per chip

Slide 12: First Stage Segmentation
[Diagram: variable-size packets arriving at rate R are segmented into 128-byte cells; each cell is carried as eight 16-byte slices (bytes 0-15, 16-31, ..., 108-127), and each slice is sent to a frame-building chip at R/8.]

Slide 13: Segmentation Chip (1st stage)
- Incoming: 16 x 10 Gb/s
- Outgoing: 8 x 2 x 10 Gb/s
- On-chip Memory: N x 1500 bytes = 7.2 Mbits of 3.2 ns SRAM
[Diagram: variable-size packets at rate R segmented into 128-byte cells and 16-byte slices (bytes 0-15, 16-31, ..., 108-127) leaving at R/8.]
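To make the cell/slice arithmetic of slides 12-13 concrete, here is a small hedged sketch (not from the original deck; the helper name segment_packet is invented) that pads a variable-size packet to a whole number of 128-byte cells and splits each cell into eight 16-byte slices, one per frame-building chip.

```python
CELL_BYTES = 128      # fixed cell size used by the buffering stages
SLICE_BYTES = 16      # each cell is striped as eight 16-byte slices (bytes 0-15, ..., 108-127)

def segment_packet(packet: bytes):
    """Pad a variable-size packet up to whole 128-byte cells and stripe each cell
    into 16-byte slices, returning a list of cells, each a list of 8 slices."""
    padded_len = -(-len(packet) // CELL_BYTES) * CELL_BYTES   # round up to a cell boundary
    padded = packet.ljust(padded_len, b"\x00")
    cells = []
    for c in range(0, padded_len, CELL_BYTES):
        cell = padded[c:c + CELL_BYTES]
        slices = [cell[s:s + SLICE_BYTES] for s in range(0, CELL_BYTES, SLICE_BYTES)]
        cells.append(slices)    # slices[i] would go to frame-building chip i
    return cells

# Example: a 300-byte packet becomes 3 cells (384 bytes with padding), 8 slices each.
cells = segment_packet(b"\xab" * 300)
print(len(cells), len(cells[0]), len(cells[0][0]))   # -> 3 8 16
```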

Slide 14: Frame Building Chip (1st stage)
- Incoming: 2 x 10 Gb/s
- Outgoing: 2 x 10 Gb/s
- On-chip Memory: N^2 x 16 bytes = 48 Mbits of 3.2 ns SRAM
[Diagram: 16-byte slices (bytes 0-15) arrive at R/8, are grouped into frames across the N destinations, and leave at R/8.]

Slide 15: Three stages on a linecard
[Diagram: 1st stage - Segmentation/Frame Building; 2nd stage - Main Buffering; 3rd stage - Reassembly. Full line rate R enters and leaves the card, with per-queue rate R/N into the main buffering stage; each stage fans out over N paths.]

Slide 16: Packet Buffering Problem
Packet buffers for a 160 Gb/s router linecard:
[Diagram: a Buffer Manager in front of 40 Gbits of buffer memory; write rate R is one 128-byte packet every 6.4 ns, and read rate R is one 128-byte packet every 6.4 ns.]
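A quick check of the numbers on this slide (my own arithmetic; the buffer = RTT x line-rate rule of thumb with RTT = 0.25 s is an assumption, not stated on the slide):

```python
CELL_BITS = 128 * 8          # one 128-byte packet
CELL_INTERVAL_NS = 6.4       # one packet written and one packet read every 6.4 ns

line_rate_gbps = CELL_BITS / CELL_INTERVAL_NS   # 1024 bits / 6.4 ns = 160 Gb/s
print(line_rate_gbps)                           # -> 160.0

# Assumed sizing rule: buffer roughly one round-trip time of traffic (RTT ~ 0.25 s).
rtt_s = 0.25
buffer_gbits = line_rate_gbps * rtt_s           # 160 Gb/s * 0.25 s = 40 Gbits
print(buffer_gbits)                             # -> 40.0
```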

Slide 17: Memory Technology
- Use SRAM? Fast enough random access time, but too low density to store 40 Gbits of data.
- Use DRAM? High density means we can store the data, but it can't meet the random access time.

Slide 18: Can't we just use lots of DRAMs in parallel?
[Diagram: the Buffer Manager stripes data across multiple DRAMs (byte ranges 0-127, 128-255, ..., 1152-1279), reading or writing 1280 bytes every 32 ns, while packets still arrive and depart as one 128-byte packet every 6.4 ns.]
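The width/interval arithmetic behind this slide, written out as a sketch based on the figures shown (128-byte cells, byte ranges up to 1279, one cell in and one cell out every 6.4 ns):

```python
CELL_BYTES = 128
CELLS_PER_ACCESS = 10                           # bytes 0-127, 128-255, ..., 1152-1279
ACCESS_BYTES = CELL_BYTES * CELLS_PER_ACCESS    # 1280 bytes per wide DRAM access
CELL_INTERVAL_NS = 6.4                          # one cell written and one cell read every 6.4 ns

# 10 cells accumulate for writing every 10 * 6.4 ns = 64 ns, and 10 cells must also be
# read out every 64 ns, so the striped DRAMs must complete one 1280-byte access
# (a write or a read) every 32 ns on average.
write_period_ns = CELLS_PER_ACCESS * CELL_INTERVAL_NS   # 64 ns per 1280-byte write
access_period_ns = write_period_ns / 2                  # writes and reads alternate
print(ACCESS_BYTES, access_period_ns)                   # -> 1280 32.0
```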

Slide 19: Works fine if there is only one FIFO
[Diagram: with a single FIFO, arriving 128-byte packets are aggregated into 1280-byte blocks, written to and read from the striped buffer memory 1280 bytes at a time, and re-serialized into 128-byte packets on departure.]

Slide 20: In practice, buffer holds many FIFOs
- e.g. in an IP Router, Q might be 200.
- In an ATM switch, Q might be 10^6.
[Diagram: the Buffer Manager fronts Q logical FIFOs (1..Q); because arriving packets belong to different queues, blocks fill unevenly (e.g. only 320 bytes of a 1280-byte block). How can we write multiple packets into different queues?]

Slide 21: Hybrid Memory Hierarchy
[Diagram: arriving packets at rate R enter a small tail SRAM cache holding the FIFO tails (queues 1..Q); a large DRAM holds the body of the FIFOs and is written and read b bytes at a time; a small head SRAM cache holds the FIFO heads, from which packets depart at rate R under an arbiter or scheduler issuing requests.]
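A highly simplified software model of this SRAM/DRAM hierarchy (a toy sketch under my own assumptions, not the actual buffer-manager design; the class and field names are invented): each queue keeps a small tail cache and head cache in "SRAM", and whenever a tail accumulates b bytes it is flushed to "DRAM" in one wide access; heads are refilled from DRAM in the same b-byte units.

```python
from collections import deque

class HybridFifo:
    """Toy model of one FIFO whose tail and head live in SRAM caches and whose
    body lives in DRAM, accessed only in wide b-byte blocks."""

    def __init__(self, b):
        self.b = b
        self.tail_sram = bytearray()   # partially filled block waiting to go to DRAM
        self.dram_body = deque()       # full b-byte blocks (the FIFO body)
        self.head_sram = bytearray()   # bytes staged for departure

    def write(self, data: bytes):
        self.tail_sram += data
        while len(self.tail_sram) >= self.b:               # tail full -> one wide DRAM write
            self.dram_body.append(bytes(self.tail_sram[:self.b]))
            del self.tail_sram[:self.b]

    def read(self, nbytes: int) -> bytes:
        while len(self.head_sram) < nbytes and self.dram_body:
            self.head_sram += self.dram_body.popleft()     # one wide DRAM read refills the head
        out, self.head_sram = self.head_sram[:nbytes], self.head_sram[nbytes:]
        return bytes(out)

# Example: b = 1280 bytes, 128-byte cells; 20 cells produce two wide DRAM writes.
q = HybridFifo(b=1280)
for i in range(20):
    q.write(bytes([i]) * 128)
print(len(q.dram_body), q.read(128)[:4])
```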

Slide 22: SRAM/DRAM results
- How much SRAM buffering, given:
  - DRAM Trc = 40 ns
  - Write and read a 128-byte cell every 6.4 ns
  - Let Q = 625, b = 2 x 40 ns / 6.4 ns = 12.5
- Two options [Iyer]:
  - Zero latency: Qb[2 + ln Q] = 61k cells = 66 Mbits
  - Some latency: Q(b-1) = 7.5k cells = 7.5 Mbits
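Plugging the slide's parameters into the two formulas, as a quick sanity check (my own arithmetic; the slide's quoted cell counts differ slightly, presumably from how b is rounded before being plugged in):

```python
import math

Q = 625                    # number of queues
b = 2 * 40 / 6.4           # b = 2 * Trc / cell interval = 12.5 cells per DRAM block
CELL_BITS = 128 * 8        # one 128-byte cell

zero_latency_cells = Q * b * (2 + math.log(Q))   # Qb[2 + ln Q]
some_latency_cells = Q * (b - 1)                 # Q(b - 1)

print(round(zero_latency_cells), round(zero_latency_cells * CELL_BITS / 1e6), "Mbits")
print(round(some_latency_cells), round(some_latency_cells * CELL_BITS / 1e6, 1), "Mbits")
```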

Slide 23: Outline
- Load-Balancing Background
- Mis-sequencing Problem
- Datapath Architecture
- First stage - Segmentation
- Second stage - Main Buffering
- Third stage - Reassembly

Slide 24: Problem Statement
[Diagram: a Queue Manager fronting 40 Gb of DRAM on a 160 Gb/s linecard; write rate R and read rate R are each one 128-byte cell every 6.4 ns.]

Slide 25: Second Stage
[Diagram: main buffering stage; 16-byte slices (bytes 0-15, 16-31, ..., 108-127) arrive at R/8, are buffered across the N queues at R/N, and leave as 16-byte slices at R/8.]

Slide 26: Queue Manager Chip (2nd stage)
- Incoming: 2 x 10 Gb/s
- Outgoing: 2 x 10 Gb/s
- 35 pins/DRAM x 5 DRAMs = 175 pins
- SRAM/DRAM Memory: Q(b-1) = 2.8 Mbits of 3.2 ns SRAM
- SRAM linked list = 1 Mbit of 3.2 ns SRAM
[Diagram: 16-byte slices at R/8 in and out; main buffering across the N queues at R/N, backed by 5 x 1 Gb DRAM at R/4.]

Slide 27: Outline
- Load-Balancing Background
- Mis-sequencing Problem
- Datapath Architecture
- First stage - Segmentation
- Second stage - Main Buffering
- Third stage - Reassembly

Slide 28: Three stages on a linecard
[Diagram: 1st stage - Segmentation/Frame Building; 2nd stage - Main Buffering; 3rd stage - Reassembly. Full line rate R enters and leaves the card, with per-queue rate R/N into the main buffering stage; each stage fans out over N paths.]

Slide 29: Third Stage Reassembly
- Incoming: 8 x 2 x 10 Gb/s
- Outgoing: 16 x 10 Gb/s
- On-chip Memory: N x 1500 bytes = 7.2 Mbits of 3.2 ns SRAM
[Diagram: 16-byte slices (bytes 0-15, 16-31, ..., 108-127) arrive at R/8 and are reassembled into variable-size packets leaving at rate R.]

Slide 30: Linecard Datapath Requirements
- 1st stage: 1 segmentation chip, 8 frame building chips
- 2nd stage: 8 queue manager chips, 40 x 1 Gb DRAMs
- 3rd stage: 1 reassembly chip
- Total chip count: 18 ASIC chips, 40 x 1 Gb DRAMs

