OTCP: SDN-Managed Congestion Control for Data Center Networks


1 OTCP: SDN-Managed Congestion Control for Data Center Networks
Simon Jouet School of Computing Science

2 Background on TCP
“For a transport endpoint embedded in a network of unknown topology and with an unknown, unknowable and constantly changing population of competing conversations, only one scheme has any hope of working – exponential backoff”
Congestion Avoidance and Control, Van Jacobson, 1988
Conservative Congestion Control Settings
  Minimum Retransmission Timeout (RTOmin): 200 ms
  Initial Retransmission Timeout (RTOinit): 1 s
  Initial Congestion Window (IW): 10 segments
IEEE/IFIP NOMS - 26/04/2016

3 Partition Aggregate Traffic
Light request to workers
Synchronous replies
Multiple flows
Typical of DC applications
  MapReduce
  Memcached
  Apache Spark
[Figure labels: Query k, Reply k, bottleneck link]

4 TCP Throughput Incast Collapse
Many flows share the same egress queue
Packets are dropped when buffers are full
RTO is used as the recovery mechanism
Bursts of traffic separated by long idle periods
Results in low throughput and long flow completion times
[Figure: buffer occupancy over time, annotated with RTOinit (1s), IW = 3, RTO (>200ms) and 2x RTO]
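The idle periods described on this slide follow from the RTO doubling after each timeout (the "2x RTO" label in the figure). A minimal sketch of that arithmetic, assuming a plain doubling schedule with no upper cap; the function names are illustrative and not from the talk:

```python
def backoff_schedule(rto_ms, timeouts):
    """Wait (ms) before each retransmission when the RTO doubles after every timeout."""
    return [rto_ms * (2 ** i) for i in range(timeouts)]


def total_idle_ms(rto_ms, timeouts):
    """Total time spent waiting across consecutive timeouts."""
    return sum(backoff_schedule(rto_ms, timeouts))


if __name__ == "__main__":
    # Compare the conservative 200 ms RTOmin with a 1 ms data-center value.
    for rto_ms in (200, 1):
        print(rto_ms, backoff_schedule(rto_ms, 3), total_idle_ms(rto_ms, 3))
```

Under these assumptions, three consecutive timeouts starting at 200 ms leave the flow idle for 1400 ms, against 7 ms when starting at 1 ms.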

5 DC Networks
“[…] a WSC server is deployed in a relatively well-known environment, leading to possible optimizations for increased performance. […] lower packet losses than in long-distance Internet connections. Thus we can tune transport or messaging parameters (timeouts, window sizes, etc.) for higher communication efficiency.”
The Datacenter as a Computer, Luiz André Barroso, Urs Hölzle, 2009
Compute environment-specific settings
  RTOmin = Route latency
  RTOmax = Route latency + Buffer latency
  CWNDmax = Route BDP
  CWNDinit (IW) = BDP / Flow fan-in
In the DC, the network properties are known or discoverable
2 – 3 orders of magnitude difference from the conservative Internet values
[Figure: topology with Controller, Core, Agg and ToR tiers; link labels 1G 1ms, 1G 0.2ms, 10x1G 0.1ms; x10]

6 OTCP Information Gathering
Add timestamp to topology discovery (OFDP)
  Controller – Switch – Switch – Controller
OpenFlow Request/Reply
  Controller – Switch – Controller
ARP Probe packets
  Controller – Switch – Host – Switch – Controller
Port status for link speed
Queue config for buffer sizes
[Figure labels: Controller, x10]
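The three probe paths on this slide overlap, so per-link latencies can be separated by subtraction. The talk does not spell out the arithmetic; the sketch below is one plausible way to do it, assuming the controller-to-switch control channel is symmetric. All function names and sample numbers are hypothetical.

```python
def switch_link_latency_us(ofdp_us, ctrl_sw_a_rtt_us, ctrl_sw_b_rtt_us):
    """One-way latency of the link between switches A and B.

    ofdp_us is the Controller - A - B - Controller time from the timestamped
    discovery packet. Half of each Controller - Switch - Controller round trip
    is subtracted (assumes a symmetric control channel).
    """
    one_way = ofdp_us - ctrl_sw_a_rtt_us / 2 - ctrl_sw_b_rtt_us / 2
    return max(one_way, 0.0)


def host_link_rtt_us(arp_probe_us, ctrl_sw_rtt_us):
    """Switch - Host - Switch round trip: the ARP probe time minus the
    Controller - Switch - Controller round trip."""
    return max(arp_probe_us - ctrl_sw_rtt_us, 0.0)


if __name__ == "__main__":
    # Hypothetical measurements in microseconds.
    print(switch_link_latency_us(500, 200, 300))
    print(host_link_rtt_us(260, 200))
```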

7 OTCP Calculations
Network properties
  Buffer depth of 60 packets
  Throughput of 1Gbps
  Expected flow fan-in α = 100
Example: Flow through Core
  Measured latency 5571µs
  RTOmin = 6ms
  RTOmax = RTOmin * (MSS / 1Gbps) * 10 = 12.771ms
  RTOinit = RTOmax * 2 = 25ms
  BDP = Latency * 1Gbps = 476 MSS
  IW = BDP / α = 5
[Figure labels: Controller, x10]
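The Core example can be reproduced numerically. The sketch below is not the authors' code, and several inputs are assumptions chosen because they match the slide's figures: an MSS of 1460 bytes for the BDP, a 1500-byte packet for buffer drain time, RTOmin as the latency rounded up to a whole millisecond, and RTOmax as the measured latency plus the time to drain a full 60-packet buffer at each of 10 queues (the factor 10 appears in the slide's expression). RTOinit is left out because its rounding is not stated.

```python
import math

LINK_RATE_BPS = 1_000_000_000  # 1Gbps, from the slide
BUFFER_PACKETS = 60            # buffer depth, from the slide
FAN_IN = 100                   # expected flow fan-in (alpha), from the slide
MSS_BYTES = 1460               # assumption: reproduces BDP = 476 MSS
PACKET_BYTES = 1500            # assumption: reproduces RTOmax = 12.771 ms


def rto_min_ms(latency_us):
    """Route latency rounded up to a whole millisecond (assumed rounding)."""
    return math.ceil(latency_us / 1000)


def rto_max_ms(latency_us, queues):
    """Route latency plus the time to drain a full buffer at each queue."""
    drain_us = BUFFER_PACKETS * PACKET_BYTES * 8 * 1_000_000 / LINK_RATE_BPS
    return (latency_us + queues * drain_us) / 1000


def bdp_segments(latency_us):
    """Bandwidth-delay product in whole MSS-sized segments."""
    bdp_bytes = latency_us * LINK_RATE_BPS // (8 * 1_000_000)
    return bdp_bytes // MSS_BYTES


def initial_window(bdp, fan_in=FAN_IN):
    """BDP shared across the expected fan-in, at least one segment."""
    return max(1, round(bdp / fan_in))


if __name__ == "__main__":
    latency_us = 5571  # measured latency through Core
    bdp = bdp_segments(latency_us)
    print(rto_min_ms(latency_us), rto_max_ms(latency_us, 10), bdp, initial_window(bdp))
```

With these assumptions the Core route yields RTOmin = 6 ms, RTOmax = 12.771 ms, BDP = 476 MSS and IW = 5. Queue counts of 2 and 6 likewise reproduce the ToR and Agg RTOmax values reported in the talk (2.069 ms and 5.805 ms).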

8 Parameters Propagation
Controller exposes a northbound JSON/REST API
Agents in the end hosts connect to the API endpoint
Controller calculates per-route congestion control values
Values are pushed to the agents on topological changes
Agent updates the host routing table

Route  RTT (µs)  RTOmin (ms)  RTOmax (ms)  RTOinit (ms)  CWNDmax (MSS)  IW
ToR    629       1            2.069        4             49
Agg    1485      2            5.805        12            127
Core   5571      6            12.771       25            476            5
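The last step above, the agent updating the host routing table, might look roughly like this on a Linux host. This is a sketch under assumptions: the talk does not name the tool, and the destination prefix, device name and function name are hypothetical. It relies on the per-route `rto_min`, `initcwnd` and `cwnd` options of `ip route` from iproute2; RTOmax and RTOinit are omitted because no per-route option is assumed for them here. The function only builds the command and does not run it.

```python
def build_ip_route_cmd(prefix, device, rto_min_ms, initcwnd, cwnd_max):
    """Build an `ip route replace` argument list carrying per-route TCP settings.

    prefix and device are hypothetical inputs; the numeric values would come
    from the controller's per-route calculation.
    """
    return [
        "ip", "route", "replace", prefix, "dev", device,
        "rto_min", f"{rto_min_ms}ms",      # minimum retransmission timeout
        "cwnd", "lock", str(cwnd_max),     # congestion window clamp (needs lock)
        "initcwnd", str(initcwnd),         # initial congestion window
    ]


if __name__ == "__main__":
    # Core route values from the table: RTOmin 6 ms, CWNDmax 476 MSS, IW 5.
    cmd = build_ip_route_cmd("10.0.0.0/24", "eth0", 6, 5, 476)
    print(" ".join(cmd))
```

An agent could pass the resulting list to `subprocess.run` with root privileges; building the list separately keeps the command easy to inspect and test.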

9 OTCP Improvements
Match the congestion control settings to the network
  Improve flow completion time
  Improve throughput and goodput
  Improve flow fairness
  Reduce latency jitter
[Figure: buffer occupancy over time, annotated with RTOinit (4ms), RTO (1ms) and IW = 1]

10 FCT Evaluation
[Figure: flow completion time (s); (a) Mean FCT, (b) 95th Percentile]

11 Goodput Evaluation
[Figure: CDF of flow goodput experiencing incast collapse]

12 Conclusion
Implemented OTCP
  Centralized, controller-based measurement of congestion control settings
  Calculates per-route parameters based on the operating environment
Improves soft-realtime partition-aggregate traffic
  12x FCT improvement at the mean, 31x at the 95th percentile
  Low and stable latency, no bursts from the IW
  Higher and fairer goodput

13 Questions?

