Department of Informatics
Networks and Distributed Systems (ND) group
Modularizing TCP with timers
Michael Welzl
Net Group, University of Rome Tor Vergata
25. 09. 2017
Goal
- Dissect TCP into general-purpose transport protocol modules such that some can become hardware primitives
- So that we can SDN-enable TCP and other transports
- End result: simpler code, hardware-supported, platform-independent
- Which modules are there?
Transport modules
Module | TCP status
Connection management | ?
Buffer management / communication with app (sender, receiver) |
Constructing headers (data, ACK) | Hardware support, I think?
Sending packets (data, ACK) | TSO, GSO, pacing
Receiving and parsing packets (data, ACK) |
Checksum calculation | Hardware support exists
Flow control | = receiver buffer management?
Congestion control | "Pluggable" in Linux and FreeBSD
Loss recovery | A mess! (also messes up CC "module")
TCP today
- CA (congestion avoidance): Where we want to be. When we don't lose packets, even with an ECN cwnd reduction, we stay there!
- SS (slow start): Start the connection; we're clueless! Designed as a replacement for starting with, e.g., cwnd=370
- FR (fast recovery): Where we end up after loss
- Not "pluggable" today!
[Cartoon from http://theoatmeal.com: "Noooooooo! NOOOOOO! DON'T LET IT HAPPEN!"]
The "ACK clocking" rule
- Packet conservation principle important, bla bla
- ACK clock, must preserve, bla bla
- Is a bursty ACK-clocked TCP better, or a paced non-ACK-clocked TCP?
- CA: cwnd+=1 breaks ACK clocking
- SS breaks ACK clocking
- IW10 breaks ACK clocking
- TLP/RACK = FR slightly deviating from ACK clocking!
- Strict ACK clocking only applied in FR
- What good has it done us?
ACK clocking in FR
- Estimate "pipe": number of packets in flight
- Try to keep that constant
- "in flight" really means: "in flight" + "in queue"
- Try hard to keep the queue filled??? Fantastic!
- For instance, can't handle drops of retransmits
- This has been called a "feature"
- RACK can handle it... but (currently?) won't reduce cwnd again...
Standing queues do exist: Reno...
- All tests with: 3-host topology, 5 Mbit/s bottleneck in CORE emulator; 1500-byte packets
- Default queue: DropTail (FIFO), 100 packets
- Qlen exceeds BDP in all our tests (base RTT 100 ms: BDP = 41 packets).
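The BDP figure for this test setup can be checked with a short calculation. This is only a minimal sketch: the function name bdp_packets is made up for illustration, and it assumes that the 1500 bytes are the full packet size on the bottleneck link and that a partial packet is rounded down.

```python
def bdp_packets(rate_bps: int, rtt_ms: int, packet_bytes: int) -> int:
    """Bandwidth-delay product in whole packets (rounded down)."""
    bdp_bits = rate_bps * rtt_ms // 1000      # bits that fit "on the wire" in one RTT
    return bdp_bits // (packet_bytes * 8)     # convert bits to whole packets


# Values from the test setup: 5 Mbit/s bottleneck, 100 ms base RTT, 1500-byte packets.
bdp = bdp_packets(5_000_000, 100, 1500)
queue_len = 100                               # DropTail queue length in packets

print(bdp)               # 41
print(queue_len > bdp)   # True: a full queue holds more than one BDP
```

With a 100-packet queue and a 41-packet BDP, a sender that fills the queue keeps more than two BDPs of data queued at the bottleneck.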
... and Cubic...
Remember PRR? It can make things worse, it seems!

RFC 6675
ack#   X  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
cwnd:    20 20 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11
pipe:    19 19 18 18 17 16 15 14 13 12 11 10 10 10 10 10 10 10 10
sent:     N  N  R                          N  N  N  N  N  N  N  N

Rate-Halving (Linux)
ack#   X  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
cwnd:    20 20 19 18 18 17 17 16 16 15 15 14 14 13 13 12 12 11 11
pipe:    19 19 18 18 17 17 16 16 15 15 14 14 13 13 12 12 11 11 10
sent:     N  N  R     N     N     N     N     N     N     N     N

(N = new data, R = retransmission)
Queue drains a little
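For comparison with the two traces above, the per-ACK send decision of PRR can be sketched as follows. This follows the pseudocode of RFC 6937 as best I can reproduce it, not any particular kernel implementation. Assumptions: everything is counted in whole segments (MSS = 1), the class and field names are illustrative, every permitted segment is actually sent (so prr_out is updated immediately), and the clamp of negative values to zero is my own addition.

```python
from dataclasses import dataclass


@dataclass
class PrrState:
    ssthresh: int           # target cwnd after the reduction (segments)
    recover_fs: int         # flight size when recovery started ("RecoverFS")
    prr_delivered: int = 0  # segments newly delivered during recovery
    prr_out: int = 0        # segments sent during recovery

    def on_ack(self, delivered: int, pipe: int, conservative: bool = False) -> int:
        """Return how many segments may be sent in response to this ACK."""
        self.prr_delivered += delivered
        if pipe > self.ssthresh:
            # Proportional part: send ssthresh/RecoverFS segments per delivered segment.
            numerator = self.prr_delivered * self.ssthresh
            target = (numerator + self.recover_fs - 1) // self.recover_fs  # integer ceil
            sndcnt = target - self.prr_out
        else:
            # Reduction bound: pipe fell below ssthresh, allow limited catch-up.
            if conservative:   # PRR-CRB
                limit = self.prr_delivered - self.prr_out
            else:              # PRR-SSRB
                limit = max(self.prr_delivered - self.prr_out, delivered) + 1
            sndcnt = min(self.ssthresh - pipe, limit)
        sndcnt = max(sndcnt, 0)   # assumption: never report a negative send count
        self.prr_out += sndcnt    # assumption: everything permitted is sent
        return sndcnt


# Halving from 20 to 10 segments; pipe is held at 19 purely for illustration.
state = PrrState(ssthresh=10, recover_fs=20)
sends = [state.on_ack(delivered=1, pipe=19) for _ in range(6)]
print(sends)   # [1, 0, 1, 0, 1, 0]
```

In the proportional branch the sender transmits on every second ACK, similar to Rate-Halving. The difference is in the second branch, which is taken once pipe has dropped to ssthresh or below.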
This can cause a "double drop"...
From: "Virtualized Congestion Control", tech. report (longer version of the SIGCOMM'16 paper): http://webee.technion.ac.il/~isaac/p/sigcomm16_vcc_extended.pdf
A test from one of my Ph.D. students
- Is this PRR? We believe so...
- But everyone thinks that PRR is only a good thing?
Solving the loss recovery problem
- Basic function that all protocols need: remember which packets were sent / ACKed, and when, for re-sending and RTT calculation ("scoreboard")
- I claim: scoreboard operations could be much simpler, and that might even make them better
- Going in the direction of RACK, but "all the way": base everything on a timer
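To make the "scoreboard" idea a bit more concrete, here is a minimal sketch of a purely timer-based one. It is an illustration under assumptions, not a design from this talk: the class and method names are hypothetical, time is a plain float supplied by the caller, the per-packet timeout is a fixed value passed in, and a retransmission simply overwrites the send time (so RTT samples for retransmitted packets are ambiguous).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Scoreboard:
    # seq -> time of the most recent (re)transmission; ACKed packets are removed
    sent_at: Dict[int, float] = field(default_factory=dict)

    def on_send(self, seq: int, now: float) -> None:
        """Record a (re)transmission."""
        self.sent_at[seq] = now

    def on_ack(self, seq: int, now: float) -> Optional[float]:
        """Mark seq as ACKed; return an RTT sample, or None if seq is unknown."""
        sent = self.sent_at.pop(seq, None)
        return None if sent is None else now - sent

    def expired(self, now: float, timeout: float) -> List[int]:
        """Packets whose per-packet timer has run out: candidates for re-sending."""
        return sorted(s for s, t in self.sent_at.items() if now - t >= timeout)


sb = Scoreboard()
sb.on_send(1, now=0.00)
sb.on_send(2, now=0.01)
rtt = sb.on_ack(1, now=0.10)              # RTT sample of about 0.1 s
lost = sb.expired(now=0.20, timeout=0.15)
print(lost)                               # [2]
```

The only operations are "record a send time", "remove on ACK" and "list entries older than the timeout"; there is no duplicate-ACK counting and no separate recovery state.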
What I envision
SS: only at the beginning!
CA: simple rules for increase/decrease events (magnitude determined by CC like before)
- Increase: upon ACK
- Decrease: upon ECN or loss
- Loss determined via (aggressive, not RTO!) per-packet timeout; reduce every time!
- Undo if we got it wrong (ACKs that shouldn't have arrived: spurious loss detection), adjust timers
- Avoid over-reacting: look at ACK rate + RTT
- No need for RTO with SS because we back off exponentially (instead of: cwnd*=factor, then cwnd=1)
Already done today with ECN!
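The increase/decrease/undo rules above could look roughly like this in code. This is a sketch under loudly stated assumptions, not the proposed mechanism itself: all names are hypothetical, the Reno-style increase (1/cwnd per ACK), the decrease factor 0.5 and the timer growth factor 1.5 are placeholders for whatever the CC module decides, and any ACK for a packet already declared lost is treated as a spurious detection, which a real sender could only conclude with extra information (e.g. to tell it apart from the ACK of a retransmission).

```python
class TimerBasedSender:
    def __init__(self, cwnd: float, timeout: float,
                 decrease_factor: float = 0.5, timeout_growth: float = 1.5):
        self.cwnd = cwnd
        self.timeout = timeout                  # per-packet timeout (not an RTO)
        self.decrease_factor = decrease_factor  # placeholder value
        self.timeout_growth = timeout_growth    # placeholder value
        self.reduced_by = {}                    # seq -> cwnd removed when seq timed out

    def _decrease(self) -> float:
        """Multiplicative decrease; returns how much cwnd was removed."""
        old = self.cwnd
        self.cwnd = max(1.0, self.cwnd * self.decrease_factor)
        return old - self.cwnd

    def on_ecn(self) -> None:
        self._decrease()

    def on_packet_timeout(self, seq: int) -> None:
        # Reduce on every per-packet timeout: repeated losses back off exponentially.
        self.reduced_by[seq] = self._decrease()

    def on_ack(self, seq: int) -> None:
        if seq in self.reduced_by:
            # ACK for a packet we declared lost: undo that reduction, relax the timer.
            self.cwnd += self.reduced_by.pop(seq)
            self.timeout *= self.timeout_growth
        else:
            self.cwnd += 1.0 / self.cwnd        # placeholder: Reno-style increase


s = TimerBasedSender(cwnd=16.0, timeout=0.2)
s.on_packet_timeout(1)      # 16 -> 8
s.on_packet_timeout(2)      # 8 -> 4 (exponential back-off, no cwnd=1 step)
after_losses = s.cwnd
s.on_ack(2)                 # packet 2 was not lost after all: 4 -> 8
after_undo = s.cwnd
s.on_ack(3)                 # normal ACK: 8 -> 8.125
print(after_losses, after_undo, s.cwnd)   # 4.0 8.0 8.125
```

The "avoid over-reacting" rule (looking at ACK rate and RTT) is deliberately left out of this sketch.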
End result
- Much simpler
- More modular
- More robust?
- As good as PRR but can better handle lost retransmits?
- Any other direct benefits?
Thoughts?