Presentation on theme: "EE384Y: Packet Switch Architectures"— Presentation transcript:
1 EE384Y: Packet Switch Architectures, Part II: Sizing Router Buffers (Recent work by Guido Appenzeller)
Nick McKeown, Professor of Electrical Engineering and Computer Science, Stanford University
2 How much Buffer does a Router need?
Universally applied rule-of-thumb: a router needs a buffer size $B = 2T \times C$, where
2T is the round-trip propagation time (or just 250ms)
C is the capacity of the outgoing link
Background:
Mandated in backbone and edge routers.
Appears in RFPs and IETF architectural guidelines.
Has major consequences for router design.
Comes from dynamics of TCP congestion control.
Villamizar and Song: “High Performance TCP in ANSNET”, CCR, 1994.
Based on 2 to 16 TCP flows at speeds of up to 40 Mb/s.
3 Example: 10Gb/s linecard or router
Requires 300Mbytes of buffering.
Read and write new packet every 32ns.
Memory technologies:
SRAM: requires 80 devices, 1kW, $2000.
DRAM: requires 4 devices, but too slow.
Problem gets harder at 40Gb/s, hence RLDRAM, FCRAM, etc.
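The two figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes the rule-of-thumb B = 2T x C with 2T = 250ms, and assumes the 32ns figure refers to a minimum-size 40-byte packet; the slide states neither explicitly, and the function names are illustrative.

```python
def rule_of_thumb_buffer_bits(rtt_s, capacity_bps):
    """Rule-of-thumb buffer B = 2T * C in bits (rtt_s is the round-trip time 2T)."""
    return rtt_s * capacity_bps


def packet_time_ns(packet_bytes, capacity_bps):
    """Time to transmit one packet at line rate, in nanoseconds."""
    return packet_bytes * 8 / capacity_bps * 1e9


capacity = 10e9   # 10Gb/s linecard
rtt = 0.250       # assumed 2T = 250ms
buffer_bits = rule_of_thumb_buffer_bits(rtt, capacity)
buffer_mbytes = buffer_bits / 8 / 1e6
slot_ns = packet_time_ns(40, capacity)  # assumed 40-byte minimum packet

print(f"buffer: {buffer_bits / 1e9:.1f} Gbit = {buffer_mbytes:.1f} Mbytes")
print(f"one 40-byte packet every {slot_ns:.0f} ns")
```

Under these assumptions the buffer comes to about 312 Mbytes, which the slide rounds to 300Mbytes, and one packet time is about 32ns.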
4 TCP
TCP adapts to congestion.
Sender sends packets, receiver sends ACKs.
Sending rate is controlled by window W.
At any time, only W unacknowledged packets may be outstanding.
W is adjusted for each packet (in congestion avoidance, CA, mode):
If ACK received: W = W + 1/W (W = W + 1 for each W packets)
If packet is lost: W = W/2 (W halved in case of loss)
The sending rate of TCP is: $R = W / RTT$
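The window update rules above can be written out as a minimal sketch. It covers only congestion avoidance as described on the slide (no slow-start, timeouts, or fast recovery), and it assumes the standard relation rate = W / RTT for the sending rate. The function names are illustrative and not taken from any TCP implementation.

```python
def on_ack(w):
    """Congestion avoidance: each ACK grows the window by 1/W."""
    return w + 1.0 / w


def on_loss(w):
    """A lost packet halves the window."""
    return w / 2.0


def sending_rate(w, rtt_s):
    """Assumed rate relation: W packets per round-trip time."""
    return w / rtt_s


w = 10.0
for _ in range(10):       # about one window's worth of ACKs
    w = on_ack(w)
after_acks = w            # close to W + 1, slightly less because 1/W shrinks
after_loss = on_loss(after_acks)

print(f"after ~W ACKs: {after_acks:.3f}, after a loss: {after_loss:.3f}")
print(f"rate at W=10, RTT=100ms: {sending_rate(10.0, 0.1):.0f} packets/s")
```

Applying the per-ACK rule for one window's worth of ACKs grows W by roughly one packet, which is the "W = W + 1 for each W packets" shorthand on the slide.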
5 Single TCP Flow
Router with large enough buffers for full link utilization.
For every W ACKs received, send W+1 packets.
[Figure: window size, buffer size and RTT plotted against time t; source and destination connected through a router with buffer B, with link capacities C' > C and C.]
8 Buffer = Rule-of-thumb
[Figure: interval magnified on next slide.]
9 Microscopic TCP Behavior
When sender pauses, buffer drains.
[Figure labels: one RTT, Drop.]
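10 Origin of rule-of-thumb
Before and after reducing window size, the sending rate of the TCP sender is the same:
$R_{old} = R_{new}$
Inserting the rate equation we get
$\frac{W_{old}}{RTT_{old}} = \frac{W_{new}}{RTT_{new}}$
The RTT is part propagation delay 2T and part queueing delay B/C. We know that after reducing the window, the queueing delay is zero, and that $W_{new} = W_{old}/2$:
$\frac{W_{old}}{2T + B/C} = \frac{W_{old}/2}{2T}$
Solving for B gives $B = 2T \times C$.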
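A small numeric check can make the argument concrete. The sketch assumes windows and buffers are counted in packets, that the window just before a loss fills both the pipe (2T x C packets) and the buffer, and that the queue is empty right after the window is halved. The function name and the 100-packet pipe are illustrative.

```python
def rate_fraction_after_halving(buffer_pkts, pipe_pkts):
    """Sending rate right after a window halving, as a fraction of link rate C.

    pipe_pkts is 2T * C expressed in packets. Just before the loss the window
    fills pipe plus buffer; after halving, with an empty queue, the rate is
    (W_max / 2) / 2T, capped at the link rate.
    """
    w_max = pipe_pkts + buffer_pkts
    return min(1.0, (w_max / 2.0) / pipe_pkts)


pipe = 100  # hypothetical 2T * C of 100 packets
for buf in (0, 50, 100, 200):
    frac = rate_fraction_after_halving(buf, pipe)
    print(f"buffer = {buf:3d} pkts -> rate after halving = {frac:.2f} x C")
```

With these assumptions the rate after halving reaches the full link rate only once the buffer is at least 2T x C, which is the rule-of-thumb for a single flow.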
11 Rule-of-thumb
Rule-of-thumb makes sense for one flow.
Typical backbone link has > 20,000 flows.
Does the rule-of-thumb still hold?
Answer:
If flows are perfectly synchronized, then Yes.
If flows are desynchronized, then No.
15 If flows are not synchronized
Aggregate window has less variation.
Therefore buffer occupancy has less variation.
The more flows, the smaller the variation.
Rule-of-thumb does not hold.
16 If flows are not synchronized
[Figure: probability distribution of buffer occupancy; labels: B, Buffer Size.]
17 Quantitative Model
Model the congestion window of a flow as a random variable: model as [equation not captured in transcript] where [equation not captured in transcript].
For many desynchronized flows:
We assume congestion windows are independent.
All congestion windows have the same probability distribution.
Now the central limit theorem gives us the queue length distribution.
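The effect of summing many independent windows can be illustrated with a short Monte Carlo sketch. It assumes, purely for illustration, that each flow's window is uniformly distributed between 2/3 and 4/3 of its mean; the transcript does not give the actual distribution. It then measures how the relative spread of the aggregate window changes with the number of flows.

```python
import random
import statistics


def relative_spread(n_flows, trials=2000, seed=1):
    """Std. deviation of the aggregate window divided by its mean.

    Each flow's window is drawn independently from an assumed uniform
    distribution on [2/3, 4/3] (mean 1); this is an illustrative choice.
    """
    rng = random.Random(seed)
    sums = [
        sum(rng.uniform(2.0 / 3.0, 4.0 / 3.0) for _ in range(n_flows))
        for _ in range(trials)
    ]
    return statistics.pstdev(sums) / statistics.mean(sums)


spread_25 = relative_spread(25)
spread_100 = relative_spread(100)
ratio = spread_25 / spread_100

print(f"relative spread, 25 flows:  {spread_25:.4f}")
print(f"relative spread, 100 flows: {spread_100:.4f}")
print(f"ratio: {ratio:.2f}")
```

Quadrupling the number of flows should roughly halve the relative spread, which is the 1/sqrt(n) behavior expected for a sum of independent, identically distributed variables.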
20 Small buffers help short flows
[Figure: average flow completion times of 14-packet flows that share a congested bottleneck link with long-lived flows.]
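21 Experiments with backbone router
GSR 12000, OC3 Line Card
Table columns: TCP Flows; Router Buffer (multiple, Pkts, RAM); Link Utilization (Model, Sim, Exp)
100 flows: buffer 0.5x, 1x, 2x, 3x; 64, 129, 258, 387 pkts; 1Mb, 2Mb, 4Mb, 8Mb RAM
Link utilization values as they appear in the transcript: 96.9%, 99.9%, 100%, 94.7%, 99.3%, 99.8%, 94.9%, 98.1%, 99.7%
400 flows: 32, 128, 192 pkts; 512kb RAM; 99.2%, 99.5%
Thanks: Experiments conducted by Paul Barford and Joel Sommers, U of Wisconsin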
22 What about Short Flows?
So far we assumed long flows in congestion avoidance mode.
What if traffic is mainly short flows in slow-start?
Answer: Behavior is different, but:
In mixes of flows, long flows drive buffer requirements.
Required buffer for short flows is independent of line speed and RTT (same for 1Mbit/s or 40Gbit/s).
23 A single, short-lived TCP flow
Flow length 62 packets, RTT ~140 ms
[Figure: bursts of 2, 4, 8, 16 and 32 packets, one per RTT, from syn to fin ack received; the whole span is the Flow Completion Time (FCT).]
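The burst pattern of such a flow can be reproduced with a short sketch. It assumes a loss-free slow-start that begins with a burst of 2 packets and doubles every RTT, with no receiver window limit; the function name is illustrative.

```python
def slow_start_bursts(flow_len_pkts, initial_burst=2):
    """Split a flow into per-RTT bursts under idealized slow-start.

    Assumes no loss and no window limit: the burst size starts at
    initial_burst and doubles each RTT until the flow is fully sent.
    """
    bursts = []
    remaining = flow_len_pkts
    size = initial_burst
    while remaining > 0:
        sent = min(size, remaining)
        bursts.append(sent)
        remaining -= sent
        size *= 2
    return bursts


print(slow_start_bursts(62))  # [2, 4, 8, 16, 32]
print(slow_start_bursts(14))  # [2, 4, 8]
```

Under these assumptions a 62-packet flow takes five bursts of 2, 4, 8, 16 and 32 packets, so its data transfer spans about five RTTs.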
24 Modelling TCP Flows vs. independent bursts
Inter-burst arrival time is greater than the time needed to drain the buffer. Therefore, we assume bursts are independent.
Poisson arrivals of flows: arrivals of length $L_{flow}$ (the flow length in packets).
Poisson arrivals of bursts: four different Poisson arrival processes of lengths 2, 4, ...
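25 The M/G/1 Model
TCP traffic is modelled as an M/G/1 arrival process: Poisson arrivals of jobs with an arrival rate of [equation not captured in transcript].
Average queue length in jobs is: [equation not captured in transcript]
This gives us an average queue length in packets of [equation not captured in transcript]
Let's see if this works in practice...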
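The formulas on this slide did not survive in the transcript, so the sketch below uses the standard Pollaczek-Khinchine result for an M/G/1 queue instead; it may differ from the exact expression on the slide. Treating each burst as a job whose service time is its size divided by the link rate, the mean backlog found by an arriving burst, in packets, is rho / (2 (1 - rho)) x E[X^2] / E[X], where rho is the link load and X is the burst size. The burst mix used here is hypothetical.

```python
def mean_backlog_pkts(load, burst_sizes, burst_probs):
    """Mean backlog (in packets) seen by an arriving burst in an M/G/1 queue.

    Standard Pollaczek-Khinchine result with bursts as jobs:
        rho / (2 * (1 - rho)) * E[X^2] / E[X]
    load is the link utilization rho, which must satisfy 0 <= rho < 1.
    """
    if not 0.0 <= load < 1.0:
        raise ValueError("load must satisfy 0 <= load < 1")
    if abs(sum(burst_probs) - 1.0) > 1e-9:
        raise ValueError("burst_probs must sum to 1")
    ex = sum(p * x for x, p in zip(burst_sizes, burst_probs))
    ex2 = sum(p * x * x for x, p in zip(burst_sizes, burst_probs))
    return load / (2.0 * (1.0 - load)) * ex2 / ex


# Hypothetical mix: bursts of 2, 4, 8, 16, 32 packets, equally likely.
sizes = [2, 4, 8, 16, 32]
probs = [0.2] * 5
backlog = mean_backlog_pkts(0.8, sizes, probs)
print(f"mean backlog at 80% load: {backlog:.1f} packets")
```

The expression depends only on the load and the burst size distribution, not on the link rate or the RTT, which matches the claim that the buffer needed for short flows is independent of line speed and RTT.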
27 Queue Distribution
To determine the required buffer, we need the queue distribution, or at least the tail end of the queue distribution.
[Figure: queue distribution P(Q = x); labels: Buffer B, Q, Packet Loss.]
For M/G/1 queues there is no general solution for the queue distribution.
We did two things (details are in the paper):
Use M/G/1 processor sharing model (bad)
Use Frank Kelly's effective bandwidth (good)
28 In Summary
Buffer size is dictated by long TCP flows.
10Gb/s linecard with 200,000 x 56kb/s flows:
Rule-of-thumb: Buffer = 2.5Gbits. Requires external, slow DRAM.
Becomes: Buffer = 6Mbits. Can use on-chip, fast SRAM.
Completion time halved for short flows.
40Gb/s linecard with 40,000 x 1Mb/s flows:
Rule-of-thumb: Buffer = 10Gbits
Becomes: Buffer = 50Mbits
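The summary numbers can be reproduced with a short calculation. The sketch assumes a round-trip time of 250ms and assumes that the reduced buffer is the rule-of-thumb divided by the square root of the number of long flows n, that is B = 2T x C / sqrt(n). That scaling is not stated in the surviving slides, but it reproduces the figures quoted here; the function names are illustrative.

```python
import math


def rule_of_thumb_bits(rtt_s, capacity_bps):
    """Rule-of-thumb buffer B = 2T * C, in bits."""
    return rtt_s * capacity_bps


def small_buffer_bits(rtt_s, capacity_bps, n_flows):
    """Assumed reduced buffer B = 2T * C / sqrt(n), in bits."""
    return rtt_s * capacity_bps / math.sqrt(n_flows)


RTT = 0.250  # assumed 2T = 250ms
cases = [
    ("10Gb/s, 200,000 flows", 10e9, 200_000),
    ("40Gb/s, 40,000 flows", 40e9, 40_000),
]
for name, capacity, n in cases:
    old = rule_of_thumb_bits(RTT, capacity)
    new = small_buffer_bits(RTT, capacity, n)
    print(f"{name}: {old / 1e9:.1f} Gbit -> {new / 1e6:.1f} Mbit")
```

Under these assumptions the first case gives about 5.6Mbit, which the slide rounds to 6Mbits, and the second gives 50Mbit.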