OTCP: SDN-Managed Congestion Control for Data Center Networks


Simon Jouet
simon.jouet@glasgow.ac.uk
School of Computing Science
https://netlab.dcs.gla.ac.uk

Background on TCP
“For a transport endpoint embedded in a network of unknown topology and with an unknown, unknowable and constantly changing population of competing conversations, only one scheme has any hope of working – exponential backoff –”
(Congestion Avoidance and Control, Van Jacobson, 1988)
Conservative congestion control settings:
- Minimum Retransmission Timeout (RTOmin): 200 ms
- Initial Retransmission Timeout (RTOinit): 1 s
- Initial Congestion Window (IW): 10 segments
IEEE/IFIP NOMS - 26/04/2016

Partition Aggregate Traffic
- Light request to workers
- Synchronous replies
- Multiple flows
- Typical of DC applications: MapReduce, Memcached, Apache Spark, …
[Figure labels: Query k, Reply k, Bottleneck link]

TCP Throughput Incast Collapse
- Many flows share the same egress queue
- Packets are dropped when buffers are full
- RTO is used as the recovery mechanism
- Bursts of traffic are separated by long idle periods
- The result is low throughput and long flow completion times
[Figure: buffer occupancy over time; annotations: IW = 3, RTOinit (1s), RTO (>200ms), 2x RTO]
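The idle periods on this slide can be illustrated with a small sketch. It assumes the standard TCP behaviour of doubling the RTO after each consecutive timeout and uses the 200 ms RTOmin from the background slide; the function name and the choice of three consecutive timeouts are illustrative assumptions, not values from the talk.

```python
def backoff_schedule(rto_ms, timeouts):
    """Wait (ms) before each retransmission when the RTO doubles per timeout."""
    return [rto_ms * 2 ** i for i in range(timeouts)]


# Assumption: a flow loses the same segment three times in a row.
waits_ms = backoff_schedule(200, 3)
idle_ms = sum(waits_ms)  # total time the sender sits idle

if __name__ == "__main__":
    print(waits_ms, idle_ms)
```

Under these assumptions the sender is idle for well over a second while the links involved have round-trip times in the millisecond range or below.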

DC Networks
“[…] a WSC server is deployed in a relatively well-known environment, leading to possible optimizations for increased performance. […] lower packet losses than in long-distance Internet connections. Thus we can tune transport or messaging parameters (timeouts, window sizes, etc.) for higher communication efficiency.”
(The Datacenter as a Computer, Luiz André Barroso, Urs Hölzle, 2009)
Compute-environment-specific settings:
- RTOmin = Route latency
- RTOmax = Route latency + Buffer latency
- CWNDmax = Route BDP
- CWNDinit (IW) = BDP / Flow fan-in
In a DC the network properties are known or discoverable.
The resulting values differ by 2 to 3 orders of magnitude from the conservative Internet settings.
[Figure: data center topology with a controller and Core, Agg and ToR tiers; link labels: 1G 1ms, 1G 0.2ms, 10x1G 0.1ms; x10]

OTCP Information Gathering
- Add a timestamp to topology discovery (OFDP): Controller - Switch - Switch - Controller
- OpenFlow Request/Reply: Controller - Switch - Controller
- ARP probe packets: Controller - Switch - Host - Switch - Controller
- Port status for link speed
- Queue config for buffer sizes
[Figure: controller and switches; x10]
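One way the three probe paths above could be combined into per-link latencies is sketched below. The slide lists only the paths; the subtraction scheme, the symmetric control-channel assumption, the function names and the sample timings are all assumptions made for illustration.

```python
def switch_link_latency_us(t_ofdp_us, rtt_ctrl_s1_us, rtt_ctrl_s2_us):
    """One-way latency of the S1 -> S2 link.

    t_ofdp_us:      timestamped discovery probe, Controller - S1 - S2 - Controller
    rtt_ctrl_sX_us: OpenFlow request/reply round trip, Controller - SX - Controller
    Assumes each control channel is symmetric, so half of its round trip
    approximates the one-way controller-to-switch delay.
    """
    return t_ofdp_us - rtt_ctrl_s1_us / 2 - rtt_ctrl_s2_us / 2


def host_link_rtt_us(t_arp_us, rtt_ctrl_switch_us):
    """Round trip of the switch-host link.

    t_arp_us: ARP probe, Controller - Switch - Host - Switch - Controller
    Subtracting the controller-switch round trip leaves the switch-host part.
    """
    return t_arp_us - rtt_ctrl_switch_us


# Hypothetical timings in microseconds, not measurements from the talk.
link_us = switch_link_latency_us(1500, 600, 400)
host_us = host_link_rtt_us(800, 600)
```

Summing such per-link values along a route would give the route latency that the per-route parameters are derived from.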

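OTCP Calculations
Network properties:
- Buffer depth of 60 packets
- Throughput of 1 Gbps
- Expected flow fan-in $\alpha = 100$
Example: flow through Core, measured latency 5571 µs
$RTO_{min} = 6\,\text{ms}$
$RTO_{max} = RTO_{min} + 60 \cdot \frac{MSS}{1\,\text{Gbps}} \cdot 10 = 12.771\,\text{ms}$
$RTO_{init} = RTO_{max} \cdot 2 \approx 25\,\text{ms}$
$BDP = Latency \cdot 1\,\text{Gbps} = 476\,MSS$
$IW = \frac{BDP}{\alpha} = 5$
(The 12.771 ms figure corresponds to the measured 5.571 ms latency plus 7.2 ms of buffer delay, taking MSS as 1500 bytes, rather than to the rounded 6 ms.)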
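The Core example can be checked with a short sketch. The slide does not state the packet sizes or the rounding rules; a 1500-byte packet for buffer drain time, a 1460-byte MSS for the BDP, rounding RTOmin and IW up, and treating the factor of 10 as the number of buffers on the path are assumptions chosen because they reproduce the slide's figures. All names are hypothetical.

```python
import math
from dataclasses import dataclass

FRAME_BYTES = 1500  # assumed size of a packet draining from a switch buffer
MSS_BYTES = 1460    # assumed TCP payload per segment, used to express BDP in MSS


@dataclass
class RouteParams:
    rto_min_ms: int
    rto_max_us: int
    rto_init_us: int
    cwnd_max_mss: int
    iw_mss: int


def otcp_params(latency_us, rate_bps=1_000_000_000, buffer_pkts=60,
                buffers_on_path=10, fan_in=100):
    """Derive per-route settings from a measured route latency (microseconds)."""
    # RTOmin: route latency rounded up to whole milliseconds (assumed rounding).
    rto_min_ms = math.ceil(latency_us / 1000)
    # Time to drain one full buffer at line rate, in microseconds.
    drain_us = buffer_pkts * FRAME_BYTES * 8 * 1_000_000 // rate_bps
    # RTOmax: route latency plus the drain time of every buffer on the path.
    rto_max_us = latency_us + drain_us * buffers_on_path
    rto_init_us = 2 * rto_max_us
    # BDP in bytes, then expressed in whole segments.
    bdp_bytes = latency_us * rate_bps // (8 * 1_000_000)
    cwnd_max_mss = bdp_bytes // MSS_BYTES
    # IW: BDP shared between the expected number of concurrent flows.
    iw_mss = max(1, math.ceil(cwnd_max_mss / fan_in))
    return RouteParams(rto_min_ms, rto_max_us, rto_init_us, cwnd_max_mss, iw_mss)


core = otcp_params(5571)
```

With these assumptions `core` holds RTOmin 6 ms, RTOmax 12771 µs, CWNDmax 476 MSS and IW 5, matching the slide; RTOinit comes out at 25542 µs, which the slide reports as 25 ms.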

Parameters Propagation
- The controller exposes a northbound JSON/REST API
- An agent on each end host connects to the API endpoint
- The controller calculates per-route congestion control values
- Values are pushed to the agents on topological changes
- The agent updates the host routing table

Route  RTT (µs)  RTOmin (ms)  RTOmax (ms)  RTOinit (ms)  CWNDmax (MSS)  IW
ToR    629       1            2.069        4             49
Agg    1485      2            5.805        12            127
Core   5571      6            12.771       25            476            5
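As a rough illustration of the last step, the sketch below turns a pushed parameter set into a Linux `ip route` command, using the per-route `rto_min`, `initcwnd` and `cwnd` options of iproute2. The talk does not describe the API payload or how the agent writes to the routing table, so the JSON field names, the prefix and the device are hypothetical. Only those three options are covered; RTOmax and RTOinit are left out. The command is built but not executed.

```python
import json


def route_command(prefix, dev, params):
    """Build an `ip route replace` argv carrying per-route TCP settings."""
    return [
        "ip", "route", "replace", prefix, "dev", dev,
        "rto_min", f"{params['rto_min_ms']}ms",    # minimum retransmission timeout
        "initcwnd", str(params["iw"]),             # initial congestion window
        "cwnd", "lock", str(params["cwnd_max"]),   # congestion window clamp
    ]


# Hypothetical payload shape; values taken from the Core row of the table.
payload = json.loads(
    '{"prefix": "10.0.0.0/24", "dev": "eth0",'
    ' "rto_min_ms": 6, "iw": 5, "cwnd_max": 476}'
)
cmd = route_command(payload["prefix"], payload["dev"], payload)

if __name__ == "__main__":
    print(" ".join(cmd))
```

An agent along these lines would run the resulting command with sufficient privileges each time the controller pushes new values.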

OTCP Improvements
- Match the congestion control settings to the network
- Improve flow completion time
- Improve throughput and goodput
- Improve flow fairness
- Reduce latency jitter
[Figure: buffer occupancy over time; annotations: IW = 1, RTOinit (4ms), RTO (1ms)]

FCT Evaluation
[Figure: flow completion time (s); panels: (a) Mean FCT, (b) 95th Percentile]

Goodput Evaluation
[Figure: CDF of goodput for flows experiencing incast collapse]

Conclusion
- Implemented OTCP
  - Centralized, controller-based measurement of congestion control settings
  - Calculates per-route parameters based on the operating environment
- Improves soft-realtime partition-aggregate traffic
  - 12x FCT improvement at the mean, 31x at the 95th percentile
  - Low and stable latency, no bursts from the IW
  - Higher and fairer goodput

Questions?