
1 TCP Congestion Control at the Network Edge
Jennifer Rexford, COS 561: Advanced Computer Networks, Fall 2016 (TTh 3:00-4:20 in CS 105)

2 Original TCP Design
(Diagram: end hosts communicating across the Internet)

3 Wireless and Data-Center Networks
(Diagram: a wireless network at the edge of the Internet)

4 Wireless Networks

5 TCP Design Setting
- Relatively low packet loss (e.g., hopefully less than 1%): okay to retransmit lost packets from the sender
- Loss is caused primarily by congestion: use loss as an implicit signal of congestion … and reduce the sending rate
- Relatively stable round-trip times: use the RTT estimate in the retransmission timer (see the sketch below)
- End-points are always on, with stable IP addresses: use IP addresses as end-point identifiers
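
A minimal sketch, not from the slides, of the standard smoothed-RTT estimator behind TCP's retransmission timer (the RFC 6298 rules); the class name and example values are illustrative:

```python
# Sketch of TCP's retransmission-timer estimator (RFC 6298).
# Constants are the standard alpha = 1/8 and beta = 1/4.

class RttEstimator:
    def __init__(self, first_sample: float):
        self.srtt = first_sample        # smoothed RTT (seconds)
        self.rttvar = first_sample / 2  # RTT variation estimate

    def update(self, sample: float) -> float:
        """Fold in a new RTT sample; return the retransmission timeout."""
        self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - sample)
        self.srtt = 0.875 * self.srtt + 0.125 * sample
        return max(1.0, self.srtt + 4 * self.rttvar)  # RTO, 1-second floor

est = RttEstimator(0.100)   # first measured RTT: 100 ms
print(est.update(0.120))    # RTO after a 120 ms sample
```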

6 Problems in Wireless Networks
- Limited bandwidth
- High latencies
- High bit-error rates
- Temporary disconnections
- Slow handoffs
- Mobile device disconnects to save energy, bearers, etc.

7 Link-Level Retransmission
- Retransmit over the wireless link: hide packet losses from the end-to-end connection … by retransmitting lost packets on the wireless link
- Works for any transport protocol

8 Split Connection
- Two TCP connections: one between the fixed host and the base station, and one between the base station and the mobile device (see the relay sketch below)
- Other optimizations: compression, just-in-time delivery, etc.
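
A minimal sketch of the split-connection idea, with hypothetical addresses and ports: the base station terminates the fixed-host connection and opens a second TCP connection to the mobile, copying bytes between the two. A real proxy would add wireless-aware optimizations on the mobile-side connection.

```python
# Split-connection relay sketch: two independent TCP connections,
# bridged byte-for-byte. "10.0.0.2" stands in for the mobile device.
import socket
import threading

def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src closes."""
    while (data := src.recv(4096)):
        dst.sendall(data)
    dst.close()

listener = socket.create_server(("0.0.0.0", 9000))   # fixed-host side
while True:
    fixed_side, _ = listener.accept()
    mobile_side = socket.create_connection(("10.0.0.2", 9000))
    threading.Thread(target=pipe, args=(fixed_side, mobile_side)).start()
    threading.Thread(target=pipe, args=(mobile_side, fixed_side)).start()
```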

9 Burst Optimization
- Radio wakeup is expensive: establishing a bearer uses battery and signaling resources
- Burst optimization: send bigger chunks less often … to allow the mobile device to go to the idle state

10 Lossless Handover
- Mobile moves from one base station to another
- Packets in flight still arrive at the old base station … and could lead to bursty loss (and a TCP timeout)
- Old base station can buffer packets and send the buffered packets to the new base station

11 Freezing the Connection
- Mobile device can predict a temporary disconnection (e.g., fading, handoff)
- Mobile can ask the fixed host to stop sending by advertising a receive window of 0 (see the sketch below)
- Benefits: avoids wasted transmission of data, and avoids loss that triggers timeouts, decreases in cwnd, etc.
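
A hedged sketch of how a user-space receiver can produce this zero-window "freeze": if the application simply stops reading, the kernel's receive buffer fills and TCP advertises a window of 0, pausing the sender without loss. The handoff predictor is a hypothetical stand-in for a link-layer signal.

```python
# Freeze-the-connection sketch: stop reading the socket so that the
# kernel buffer fills and TCP advertises a zero receive window.
import socket
import time

def handoff_predicted() -> bool:
    """Hypothetical stand-in for a link-layer hint (fading, handoff)."""
    return False

def run_receiver(sock: socket.socket) -> None:
    while (chunk := sock.recv(4096)):
        if handoff_predicted():
            time.sleep(2.0)  # don't read: window shrinks to 0, sender pauses
        # ... process chunk ...
```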

12 Data-Center Networks

13 Modular Network Topology
- Containers
- Racks with multiple servers
- Top-of-rack switches

14 Tree-Like Topologies
(Diagram: a multi-rooted tree of core routers (CR), aggregation routers (AR), switches (S), and servers (A))
- Many equal-cost paths
- Small round-trip times (e.g., < 250 microseconds)

15 Commodity Switches
- Low-cost switches, especially for top-of-rack switches
- Simple memory architecture: small packet-buffer space, a shared buffer over all input ports, and simple drop-tail queues

16 Multi-Tier Applications
(Diagram: a front-end server sends requests to an aggregator, which fans out to lower-level aggregators and then to workers)

17 Application Mix
- Partition-aggregate workflow: multiple workers work in parallel; a straggler slows down the entire system; many workers send their responses at the same time
- Diverse mix of traffic: low latency for short flows, high throughput for long flows
- Multi-tenancy: many tenants sharing the same platform, running the network at high levels of utilization, with a small number of large flows on links

18 TCP Incast Problem
- Multiple workers transmitting to one aggregator: many flows traversing the same link
- Burst of packets sent at (nearly) the same time … into a relatively small switch memory, leading to high packet loss
- Some results are slow to arrive and may be excluded from the final results
- Developer software changes: limit the size of worker responses, or randomize the sending time for responses (see the sketch below)
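
A minimal sketch of the "randomize the sending time" mitigation: each worker adds a small random jitter before responding, so responses do not all hit the shallow switch buffer at once. The 10 ms jitter window is an assumption, not a value from the slides.

```python
# Incast mitigation sketch: desynchronize worker responses with jitter.
import random
import socket
import time

def send_response(sock: socket.socket, payload: bytes,
                  max_jitter_s: float = 0.010) -> None:
    time.sleep(random.uniform(0.0, max_jitter_s))  # spread out the burst
    sock.sendall(payload)
```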

19 Queue Buildup
- Mix of long and short flows: long flows fill up the buffers in switches … causing queuing delay (and loss) for the short flows
- E.g., a queuing delay of 1-14 milliseconds is large relative to the propagation delay (e.g., 100 microseconds intra-rack, 250 microseconds inter-rack), leading to RTT variance and a big throughput drop
- Shared switch buffers: short flows on one port … affected by long flows on other ports

20 TCP Outcast Problem
- Mix of flows at two different input ports, destined for the same output: many inter-rack flows, few intra-rack flows
- Burst of packet arrivals on one input port causes bursty loss for the other
- Harmful to the intra-rack flows: they lose multiple packets, and the loss is detected by timeout
- Irony: worse throughput despite lower RTT!

21 Delayed Acknowledgments
- Sending ACKs can be expensive: e.g., sending a 40-byte ACK packet for each data packet
- Delayed ACKs reduce the overhead: the receiver waits before sending the ACK … in the hope of piggybacking the ACK on a response
- Delayed-ACK mechanism: set a timer when the data arrives (e.g., 200 msec); piggyback the ACK, or send an ACK for every other packet … or send an ACK after the timer expires
- The timeout for a delayed ACK is an eternity in a data center! Disable delayed ACKs, or shorten the timer (see the sketch below)
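
On Linux, delayed ACKs can be suppressed per-socket with the TCP_QUICKACK option, which Python exposes as socket.TCP_QUICKACK; a hedged, Linux-only sketch (the host name is a placeholder):

```python
# Linux-only sketch: disable delayed ACKs with TCP_QUICKACK.
# The flag is not sticky -- the kernel may clear it -- so re-set
# it around each receive if it needs to persist.
import socket

sock = socket.create_connection(("example.com", 80))  # placeholder host
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)
data = sock.recv(4096)
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)  # re-arm
```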

22 Data-Center TCP (DCTCP)
- Key observation: TCP reacts to the presence of congestion … not to the extent of congestion
- Measuring the extent of congestion: mark packets when the buffer exceeds a threshold
- Reacting to congestion: reduce cwnd in proportion to the fraction of marked packets (see the sketch below)
- Benefits: react early, as the queue starts to build; prevent harm to packets on other ports; get workers to reduce their sending rate early
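
A sketch of DCTCP's control law as described in the paper (Alizadeh et al., SIGCOMM 2010): once per window, update a running estimate alpha of the fraction of ECN-marked packets with gain g (the paper suggests g = 1/16), and cut cwnd in proportion to alpha rather than halving it on any mark.

```python
# DCTCP control-law sketch: alpha tracks the fraction of marked
# packets; the window is reduced in proportion to alpha.

class DctcpState:
    def __init__(self, cwnd: float, g: float = 1.0 / 16):
        self.cwnd = cwnd
        self.alpha = 0.0  # running estimate of the fraction marked
        self.g = g

    def on_window_end(self, acked: int, marked: int) -> None:
        frac = marked / acked if acked else 0.0
        self.alpha = (1 - self.g) * self.alpha + self.g * frac
        if marked:
            self.cwnd *= 1 - self.alpha / 2  # gentle, proportional cut

s = DctcpState(cwnd=100.0)
s.on_window_end(acked=100, marked=10)  # 10% marked in this window
print(s.cwnd, s.alpha)  # cwnd cut by ~0.3%, alpha ~= 0.006
```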

23 Poor Multi-Path Load Balancing
- Multiple shortest paths between pairs of hosts: spread the load over multiple paths
- Equal-cost multipath (ECMP): round robin, or hash-based (see the sketch below)
- Uneven load: elephant flows congest some paths … while other paths are lightly loaded
- Reducing congestion: careful routing of elephant flows
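
A minimal sketch of hash-based ECMP: hash the flow five-tuple to pick one of the equal-cost next hops, so a flow's packets stay on one path (preserving order) while two elephant flows can still collide on the same path. The function and field names are generic illustrations, not a real switch API.

```python
# Hash-based ECMP sketch: one flow -> one path, chosen by hashing
# the five-tuple; collisions between elephant flows cause hot spots.
import hashlib

def ecmp_next_hop(src_ip: str, dst_ip: str, proto: int,
                  src_port: int, dst_port: int, num_paths: int) -> int:
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_paths

print(ecmp_next_hop("10.0.1.5", "10.0.2.9", 6, 41000, 80, num_paths=4))
```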

24 Discussion
- The DCTCP paper

25 Questions
- What problem does the paper solve, and why?
- What are the novel/interesting aspects of the solution?
- What are the strengths of the paper?
- What are the weaknesses of the paper?
- What follow-on work would you recommend to build on, extend, or strengthen the work?

26 Queue Management Background

27 Router
(Diagram: a processor in the control plane; the switching fabric and line cards in the data plane)

28 Line Cards (Interface Cards, Adaptors)
- Packet handling: packet forwarding, buffer management, link scheduling, packet filtering, rate limiting, packet marking, and measurement
(Diagram: receive path from the link through lookup; transmit path to/from the switch fabric)

29 Packet Switching and Forwarding
(Diagram: router R1 with four links; each ingress chooses an egress link, e.g., a packet labeled "4" arriving at the link 1 ingress leaves via the link 2 egress)

30 Queue Management Issues
- Scheduling discipline: which packet to send? Some notion of fairness? Priority?
- Drop policy: when should you discard a packet? Which packet to discard?
- Goal: balance throughput and delay; huge buffers minimize drops, but add to queuing delay (thus higher RTT, longer slow start, …)

31 FIFO Scheduling and Drop-Tail
- Access to the bandwidth: a first-in first-out queue; packets are only differentiated when they arrive
- Access to the buffer space: drop-tail queuing; if the queue is full, drop the incoming packet (see the sketch below)
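
A minimal sketch of a drop-tail FIFO queue as the slide describes it: one queue served in arrival order, with arrivals discarded whenever the buffer is full.

```python
# Drop-tail FIFO sketch: serve in arrival order; drop arrivals
# that find the buffer full.
from collections import deque

class DropTailQueue:
    def __init__(self, capacity: int):
        self.buf = deque()
        self.capacity = capacity

    def enqueue(self, pkt) -> bool:
        if len(self.buf) >= self.capacity:
            return False          # queue full: drop the incoming packet
        self.buf.append(pkt)
        return True

    def dequeue(self):
        return self.buf.popleft() if self.buf else None
```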

32 Bursty Loss From Drop-Tail Queuing
- TCP depends on packet loss: packet loss is an indication of congestion, and TCP's additive increase drives the network into loss
- Drop-tail leads to bursty loss: on a congested link, many packets encounter a full queue
- Synchronization: many connections lose packets at once

33 Slow Feedback from Drop Tail
- Feedback comes when the buffer is completely full … even though the buffer has been filling for a while
- Plus, the filling buffer is increasing the RTT … making detection even slower
- Better to give early feedback: get 1-2 connections to slow down before it's too late!

34 Random Early Detection (RED)
- Router notices that the queue is getting full … and randomly drops packets to signal congestion
- Packet drop probability: increases as the queue length increases; if the average queue is short, drop nothing, else set the drop probability as f(avg queue length) (see the sketch below)
(Plot: drop probability vs. average queue length)
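
A sketch of RED's drop decision: keep an exponentially weighted moving average of the queue length, and drop arrivals with a probability that rises linearly between a minimum and maximum threshold. The thresholds, maximum probability, and averaging weight are illustrative values; these are exactly the parameters slide 36 notes are hard to tune.

```python
# RED sketch: probabilistic early drops driven by the average queue length.
import random

class Red:
    def __init__(self, min_th=5.0, max_th=15.0, max_p=0.1, weight=0.002):
        self.min_th, self.max_th, self.max_p = min_th, max_th, max_p
        self.weight = weight  # EWMA gain for the average queue length
        self.avg = 0.0

    def should_drop(self, queue_len: int) -> bool:
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        if self.avg < self.min_th:
            return False      # short queue: never drop
        if self.avg >= self.max_th:
            return True       # long queue: always drop
        frac = (self.avg - self.min_th) / (self.max_th - self.min_th)
        return random.random() < self.max_p * frac  # linear ramp in between
```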

35 Properties of RED
- Drops packets before the queue is full, in the hope of reducing the rates of some flows
- Drops packets in proportion to each flow's rate: high-rate flows are selected more often
- Drops are spaced out in time, which helps desynchronize the TCP senders
- Tolerant of burstiness in the traffic, by basing the decisions on the average queue length

36 Problems With RED
- Hard to get the tunable parameters just right: how early to start dropping packets? What slope for the increase in drop probability? What time scale for averaging the queue length?
- RED has mixed adoption in practice: if the parameters aren't set right, RED doesn't help
- Many other variations in the research community, with names like "Blue" (self-tuning), "FRED", …

37 Feedback: From Loss to Notification
- Early dropping of packets: good, because it gives early feedback; bad, because it has to drop the packet to give the feedback
- Explicit Congestion Notification (ECN): the router marks the packet with an ECN bit, and the sending host interprets the mark as a sign of congestion (see the sketch below)
- Requires participation of both the hosts and the routers
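
A sketch of the marking step, based on the standard ECN codepoints from RFC 3168 (details the slide does not spell out): the two low-order bits of the IP TOS/traffic-class byte carry the ECN field, and a congested router rewrites an ECN-capable codepoint to CE instead of dropping.

```python
# ECN marking sketch (RFC 3168 codepoints): rewrite ECT -> CE under
# congestion; packets that are not ECN-capable must still be dropped.
from typing import Optional

NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11

def mark_or_drop(tos: int, congested: bool) -> Optional[int]:
    """Return the updated TOS byte, or None if the packet is dropped."""
    if not congested:
        return tos
    if tos & 0b11 == NOT_ECT:
        return None               # not ECN-capable: fall back to dropping
    return (tos & ~0b11) | CE     # set CE, preserve the DSCP bits

print(bin(mark_or_drop(0b10, congested=True)))  # ECT(0) -> CE (0b11)
```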

