
1 On Horrible TCP Performance over Underwater Links
Abdul Kabbani, Balaji Prabhakar
Stanford University

2 In Defense of TCP
Abdul Kabbani, Balaji Prabhakar
Stanford University

3 Overview
TCP has done well for the past 20 years
–It is continuing to do well
However, some performance problems have been identified
–TCP needs long buffers to keep links utilized
–It doesn't perform well on large BWxDelay links: it is oscillatory, and it is sluggish
–TCP takes too long to complete short flows
Note: we're not addressing TCP over wireless
We revisit some of these issues and find
–Either we have demanded too much
–Or there are very close relatives of TCP that satisfy our demands

4 Some background
I've recently become familiar with congestion control in two LAN networks
–Fibre Channel (for Storage Area Networks)
–Ethernet (as part of a standardization effort)
These networks face much more severe operating conditions than the Internet, and yet they function quite well!
Let's look at them briefly

5 SAN: Fibre Channel
Fibre Channel: a standardized protocol for SANs; main features
–Packet switching
–Typical topology: 3-5 hop networks
–No packet drops! Buffer-to-buffer credits are used to transport data
–No end-to-end congestion control!
Very wide deployment
–The dominant technology for host-to-storage-array data transfers
No congestion-related problems or congestion collapse reported to date!
Upon investigation, here are some factors helping FC networks
–Small file sizes (128 KB, sent as 64 packets of 2 KB, comparable to the buffer size at the switches)
–Light loads (30-40%)
–Small topologies
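The buffer-to-buffer credit mechanism above is what lets Fibre Channel avoid drops entirely. A minimal sketch of the idea in Python (the class, names, and credit counts are illustrative assumptions, not the FC standard's actual wire protocol):

```python
# Sketch of buffer-to-buffer credit flow control: the sender may
# transmit only while it holds credits, and the receiver returns one
# credit per buffer slot it frees. A sender with zero credits stalls
# instead of dropping packets, so no end-to-end loss recovery is needed.

class CreditedLink:
    def __init__(self, credits):
        self.credits = credits      # receiver buffer slots granted up front

    def try_send(self):
        if self.credits == 0:
            return False            # no credit: stall, never drop
        self.credits -= 1           # consume one buffer slot at the receiver
        return True

    def credit_returned(self):
        self.credits += 1           # receiver drained a buffer, credit flows back

link = CreditedLink(credits=2)
sent = [link.try_send() for _ in range(3)]
assert sent == [True, True, False]  # third send stalls: credits exhausted
link.credit_returned()
assert link.try_send()              # a returned credit unblocks the sender
```

Note how congestion control becomes implicit: the receiver's buffer size bounds the in-flight data per hop, which is why FC can do without any end-to-end scheme.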

6 Data Center Ethernet
Involved in developing a congestion management algorithm in the IEEE 802.1 Data Center Bridging standards activity, for Data Center Ethernet
–With Berk Atikoglu, Abdul Kabbani, Rong Pan and Mick Seaman
Ethernet vs Internet (not an exhaustive list)
1. There is no end-to-end signaling in Ethernet a la per-packet ACKs in the Internet, so congestion must be signaled to the source by switches; it is not possible to know the round-trip time, and the algorithm is not automatically self-clocked (like TCP)
2. Links can be paused; i.e. packets may not be dropped
3. No sequence numbering of L2 packets
4. Sources do not start transmission gently (like TCP slow start); they can potentially come on at the full line rate of 10 Gbps
5. Ethernet switch buffers are much smaller than router buffers (100s of KBs vs 100s of MBs)
6. Most importantly, the algorithm should be simple enough to be implemented entirely in hardware

7 Summary
We see a "hierarchy of harsh operating environments"
–Low-BWxDelay Internet, high-BWxDelay Internet, Ethernet, Wireless, etc.
Based on experience with Ethernet and Fibre Channel, I'm convinced that
–TCP is operating in an environment where sufficient information and flexibility exist to obtain good performance, with minimal changes
In the next part, we will illustrate this claim
–Since we're not allowed to mention any schemes out there, we thought we'd invent a new one!
Will consider high-BWxDelay networks
–With short buffers (10-20% of BWxDelay)
–Small number of sources (e.g. 1 source)
–Assume: ECN marking
–Show that utilization can be as high as 100%
–Short flows need not suffer large transfer times

8 Single Link Topology
Recall
–A single TCP source needs BWxDelay worth of buffering to run at the line rate
–With shorter buffers, TCP loses throughput dramatically
–This hurts TCP in very large BWxDelay networks
We consider a single long link to begin with
–BWxDelay = 1000 pkts
–Buffer used = 200 pkts
–Marking probability is a step function of queue occupancy
(figure: link A to B, 400 Mbps, 30 msec RTT; table: sampling probability rises from 1% to 100% as queue occupancy goes from 25% to 100%)
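The step-function marking rule in the table can be sketched as follows. The 25%/50%/100% occupancy thresholds and the 1% and 100% probabilities come from the slide; the intermediate probability is an assumption, since the transcript garbled that cell of the table:

```python
import random

def mark_probability(queue_occupancy):
    """Sampling/marking probability as a step function of buffer occupancy
    (occupancy given as a fraction of buffer size, in [0, 1])."""
    if queue_occupancy < 0.25:
        return 0.0    # lightly loaded queue: never mark
    if queue_occupancy < 0.50:
        return 0.01   # 1% sampling, per the slide's table
    if queue_occupancy < 1.0:
        return 0.10   # assumed intermediate value (garbled in the transcript)
    return 1.0        # full buffer: mark every packet

def should_mark(queue_occupancy):
    """Randomly sample packets for ECN marking at the current probability."""
    return random.random() < mark_probability(queue_occupancy)
```

The point of the staircase is that marking pressure grows with occupancy, so sources back off harder as the 200-packet buffer fills.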

9 TCP throughput with short buffers
(figure: throughput plot; utilization is only 78%)

10 TCP throughput with short buffers: varying number of sources

11 Improvement: Take One
Clearly, cutting the window by a factor of 2 is harmful
–The source takes a long time to build its window back up
So let's consider a "multibit TCP" which allows us to cut the window by smaller factors:
cwnd <- cwnd (1 - ECN / 2^(n+1))
–E.g. with 6-bit TCP, the smallest cut multiplies the window by 127/128
(figure: as queue occupancy grows from 25% to 100%, the sampling probability grows from 1% to 100% and the n-bit mark value grows from 1 to 2^n - 1)

12 Single-link: Window Size

13 Single-link: Queue Occupancy
(figure: panels for 1 source and 2 sources)

14 Multiple Link Topology
Want to see if the improvements persist as we go to larger networks
Parking-lot topology
(figure: nodes 0-3 in a line, 400 Mbps links, 30 msec RTT, flows R1-R5)

15 Parking Lot Utilization: Single-hop Flows
(figure: utilization plots for the single-hop flows R1, R2, R3 on the parking-lot topology)

16 Parking Lot Utilization: Multi-hop Flows
(figure: R5 is the two-hop flow, R4 the three-hop flow)

17 Summary
In both the single-link and the multiple-link topologies
–Ensuring that TCP doesn't always cut its window by 2 is a good idea
Can we achieve this without using multiple bits?

18 Adaptive 1-bit TCP Source
We came up with this over the weekend, so it really is part of this talk and not "our favorite algorithm"
The source maintains an "average congestion seen" value, AVE
Updating AVE: simple exponential averaging
–AVE <- (AVE + ECN) / 2
–Note: AVE stays between 0 and 1
Using AVE:
–cwnd <- cwnd / (1 + AVE)
–The decrease factor is between 1 and 2
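The two update rules above can be sketched as a small source model (an illustrative sketch: the class name is ours, and we assume the window cut is applied only on ACKs whose ECN bit is set, which the slide leaves implicit):

```python
class AdaptiveOneBitSource:
    """Adaptive 1-bit decrease: each ACK carries a single ECN bit (0 or 1);
    the source keeps a running average AVE of the bits it has seen and
    shrinks cwnd by a factor between 1 and 2 depending on how congested
    the path has recently looked."""

    def __init__(self, cwnd):
        self.cwnd = cwnd
        self.ave = 0.0                      # average congestion seen, in [0, 1]

    def on_ack(self, ecn_bit):
        self.ave = (self.ave + ecn_bit) / 2  # exponential averaging
        if ecn_bit:
            self.cwnd /= (1 + self.ave)      # decrease factor in (1, 2]

src = AdaptiveOneBitSource(cwnd=100.0)
src.on_ack(1)   # AVE becomes 0.5, so cwnd shrinks by 1.5x, not 2x
assert abs(src.cwnd - 100.0 / 1.5) < 1e-9
```

An isolated mark after a quiet period (low AVE) barely dents the window, while a burst of marks drives AVE toward 1 and the cut toward TCP's classic halving, which is how one feedback bit approximates the multibit behavior.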

19 Single-link: Window Size
(figure, annotated 100%)

20 Single-link: Queue Occupancy
(figure: panels for 1 source and 2 sources)

21 Single-link: Utilization

22 Improving the transfer time for short flows
Ran out of time for this one
But if we just use a starting window size of 10 pkts as opposed to 1 pkt, most short flows will complete within 1 RTT of the slow-start phase
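The claim above is easy to check with slow-start arithmetic (an idealized model assuming the window doubles each RTT with no losses; the function is ours, not from the talk):

```python
def rtts_to_finish(flow_pkts, initial_window):
    """RTTs for a flow of flow_pkts packets to complete during slow start,
    assuming the congestion window doubles every RTT and nothing is lost."""
    sent, cwnd, rtts = 0, initial_window, 0
    while sent < flow_pkts:
        sent += cwnd    # one window's worth of packets per RTT
        cwnd *= 2       # slow-start doubling
        rtts += 1
    return rtts

# A 10-packet short flow: 1 RTT with a starting window of 10,
# but 4 RTTs (1 + 2 + 4 + 8 packets) starting from a window of 1.
assert rtts_to_finish(10, 10) == 1
assert rtts_to_finish(10, 1) == 4
```

On a 30 msec RTT link, that is the difference between roughly 30 and 120 msec of transfer time, which is why the larger initial window matters so much for short flows.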

23 Conclusion
The wide-area Internet is quite a friendly environment compared to Ethernet, Fibre Channel and, certainly, wireless
Simple fixes exist (and are well-known) for high-BWxDelay networks
–The relationship with buffer size is useful to understand
–But short buffers are quite adequate
–A fake watermark on the buffer (e.g. at 50% of the buffer size) reduces packet drops drastically
Using a fake watermark on router buffers enables smaller buffers and reduces bursty drops

24 Long live TCP! (with facelifts)

