A Switch-Based Approach to Starvation in Data Centers. Alex Shpiner, joint work with Isaac Keslassy. Faculty of Electrical Engineering, Technion.

Presentation transcript:

A Switch-Based Approach to Starvation in Data Centers
Alex Shpiner
Joint work with Isaac Keslassy
Faculty of Electrical Engineering, Technion, Haifa, Israel

2 The Problem
Temporary starvation of long TCP flows in datacenter networks.
In cooperation with [company logo] (formerly [company logo]).
• Crucial effect on applications (e.g. real-time, distributed computing).
• Outline:
  - Characterization of the datacenter network.
  - Why does starvation happen?
  - Switch-based solution.

3 Datacenter Network
• Low propagation times (t_p): t_p is on the order of µs, instead of ms as in the Internet.
• Simple datacenter model: [Figure: a buffer of size B feeding a link of capacity C = 10 Gbps]
• Small t_p => small buffers: B = C · t_p (rule of thumb) [Villamizar et al., 1994].
• Many users with long TCP flows (large N).

4 Why Starvation?
• Total number of packets (∑ Cwnd_i, large) >> network capacity (small).
• Links and buffers cannot hold all packets of all flows, even if every flow has a congestion window of Cwnd_i = 1.
• High drop rate → timeouts → starvation.
[Figure: flows, links, buffers (B), and packets]
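The argument on this slide can be checked with a little arithmetic. The sketch below plugs in the simulation parameters quoted later in the deck (400 flows, 100 Mbps, 0.1 ms propagation RTT, 20-packet buffer, 1500-byte packets); the function and variable names are illustrative only.

```python
# Back-of-the-envelope check using the simulation parameters quoted on the slides.
LINK_CAPACITY_BPS = 100e6   # 100 Mbps
PROP_RTT_S = 0.1e-3         # 0.1 ms propagation RTT
BUFFER_PKTS = 20            # buffer size in packets
PACKET_BYTES = 1500
NUM_FLOWS = 400


def pipe_capacity_pkts(capacity_bps, rtt_s, packet_bytes):
    """Packets that fit on the link during one RTT (bandwidth-delay product)."""
    return capacity_bps * rtt_s / (8 * packet_bytes)


pipe = pipe_capacity_pkts(LINK_CAPACITY_BPS, PROP_RTT_S, PACKET_BYTES)
total_capacity = pipe + BUFFER_PKTS   # packets the link and buffer can hold at once
min_demand = NUM_FLOWS * 1            # every flow at its minimum window, Cwnd = 1

print(f"link holds {pipe:.2f} packets, buffer holds {BUFFER_PKTS}")
print(f"capacity {total_capacity:.2f} packets vs. minimum demand {min_demand}")
```

With these numbers the link holds less than one packet and the buffer 20, so roughly 21 packets can be in the network while 400 flows each want at least one in flight.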

5 Starvation (Simulations)
Distribution of max. starvation time, where starvation time = time between two successfully transmitted packets.
Simulation parameters: 400 TCP flows, link capacity = 100 Mbps, prop. RTT = 0.1 ms, buffer = 20 packets, packet size = 1500 bytes, UDP rate = 5% of link capacity.
[Figure: histogram; x-axis: max. starvation time (sec); y-axis: number of flows]
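The metric defined on this slide can be computed from a per-flow list of successful transmission times. The following is a minimal sketch, not the authors' measurement code. It assumes that the start and end of the observation window also count as gap boundaries, so that a flow that never transmits is charged the whole window. The function name and arguments are hypothetical.

```python
def max_starvation_time(tx_times, start, end):
    """Longest gap (seconds) without a successful transmission in [start, end].

    tx_times: timestamps of one flow's successfully transmitted packets.
    Assumption: the window edges are treated as gap boundaries.
    """
    inside = sorted(t for t in tx_times if start <= t <= end)
    points = [start] + inside + [end]
    return max(later - earlier for earlier, later in zip(points, points[1:]))


# A flow that sends at 0.5 s, 1.0 s and 4.0 s during a 10 s run
print(max_starvation_time([0.5, 1.0, 4.0], 0.0, 10.0))
```

Collecting this value for every flow and plotting a histogram gives the kind of distribution the slide describes.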

6 Unfairness (Simulations)
Distribution of throughput per flow (unfairness).
Simulation parameters: 400 TCP flows, link capacity = 100 Mbps, prop. RTT = 0.1 ms, buffer = 20 packets, packet size = 1500 bytes, UDP rate = 5% of link capacity, examined time (T) = 10 sec.
[Figure: histogram; x-axis: throughput (pkts/T); y-axis: number of flows]

7 The Goal
1. Reduce starvation of the long TCP flows.
2. Switch-based solution for the datacenter.
Alternative solutions:
• TCP throughput collapse (InCast) solutions (require changes in TCP or in the application):
  - Reducing and randomizing retransmission timeouts [V. Vasudevan et al., 2009].
  - Increasing SRU size, changing TCP [A. Phanishayee et al., 2008].
  - Limiting the number of servers, global scheduling [E. Krevat et al., 2007].
• Larger buffers [R. Morris, 1997]: high delays, require DRAM memories.

8 Objectives
• Transparent to the end hosts.
• No change in network topology.
• No significant impact on the switch architecture.
• No additional buffering.

9 The Idea
[Figure: example with a buffer of B = 2 packets; one case is marked X and the other OK]

10 Alternative Fairness Algorithms
• Deficit Round-Robin (DRR) [M. Shreedhar and G. Varghese, 1996].
• Stochastic Fair Queuing (SFQ) [P. McKenney, 1990].
• Drawbacks:
  - Inefficient buffer utilization (e.g. with bursts).
  - Complicated queue management (RR, LQF).

11 Hashed Credits Fair (HCF)
• Bins provide fairness.
• HP (high-priority) queue avoids starvation.
• LP (low-priority) queue provides high output link utilization.
• Time is divided into priority periods: at the start of each period, reset the credits and change the parameters of the hash function.
[Figure: HCF structure with per-bin credits]
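The slide names the building blocks of HCF but not the exact per-packet rules, so the following is only a sketch under stated assumptions. A packet's flow is hashed to a bin; if the bin still has credits, the packet goes to the HP queue and one credit is spent, otherwise it goes to the LP queue. The output serves HP before LP, and both queues share one buffer. The class name, method names, the salted-hash scheme, and the choice to leave queued packets untouched at a period boundary are all hypothetical. When a priority period ends is left to the caller.

```python
from collections import deque
import random


class HCFQueue:
    """Illustrative sketch of a Hashed Credits Fair style queue (assumptions above)."""

    def __init__(self, num_bins, max_credits, buffer_pkts, seed=0):
        self.num_bins = num_bins
        self.max_credits = max_credits
        self.buffer_pkts = buffer_pkts      # shared by both queues
        self.rng = random.Random(seed)
        self.hp = deque()                   # high-priority FIFO
        self.lp = deque()                   # low-priority FIFO
        self.start_priority_period()

    def start_priority_period(self):
        """Reset every bin's credits and re-parameterize the hash: O(num_bins)."""
        self.credits = [self.max_credits] * self.num_bins
        self.salt = self.rng.getrandbits(32)

    def _bin(self, flow_id):
        # Salted hash, so flows that collide in one period likely do not in the next.
        return hash((flow_id, self.salt)) % self.num_bins

    def enqueue(self, flow_id, packet):
        """O(1). Returns False if the shared buffer is full and the packet is dropped."""
        if len(self.hp) + len(self.lp) >= self.buffer_pkts:
            return False
        b = self._bin(flow_id)
        if self.credits[b] > 0:
            self.credits[b] -= 1
            self.hp.append(packet)
        else:
            self.lp.append(packet)
        return True

    def dequeue(self):
        """O(1). Serve the HP queue first, then the LP queue; None if both are empty."""
        if self.hp:
            return self.hp.popleft()
        if self.lp:
            return self.lp.popleft()
        return None
```

In this sketch, a flow that bursts several packets spends its bin's credits quickly and its remaining packets wait in the LP queue, so a packet from a flow in another bin that arrives later is still served ahead of them.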

12 Hashed Credits Fair (HCF) Complexity
Complexity:
• Enqueueing: O(1)
• Dequeueing: O(1)
• Initialization: O(num. of bins)
Memory space:
• Bin array: O(num. of bins × log(max. credits))
• Additional queue pointers: O(1)
Practically: O(1)
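To get a feel for the bin-array term, the sketch below counts the bits needed for one credit counter per bin. The bin count and credit limit in the example are made-up numbers chosen for illustration, not values from the talk.

```python
def bin_array_bits(num_bins, max_credits):
    """Bits needed to store one credit counter (0..max_credits) per bin."""
    bits_per_counter = max(1, max_credits.bit_length())
    return num_bins * bits_per_counter


# Hypothetical sizing: 1024 bins, at most 15 credits per bin
bits = bin_array_bits(1024, 15)
print(bits, "bits =", bits // 8, "bytes")
```

For these illustrative values the array needs 4 bits per bin, which is 512 bytes in total.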

13 FIFO vs. HCF Starvation
Distribution of max. starvation times.
Simulation parameters: 400 TCP flows, link capacity = 100 Mbps, prop. RTT = 0.1 ms, buffer = 20 packets, packet size = 1500 bytes, UDP rate = 5% of link capacity.
[Figure: histograms labeled "before" and "after"; x-axis: max. starvation time (sec); y-axis: number of flows]

14 FIFO vs. HCF Unfairness
Distribution of throughput per flow (unfairness).
Simulation parameters: 400 TCP flows, link capacity = 100 Mbps, prop. RTT = 0.1 ms, buffer = 20 packets, packet size = 1500 bytes, UDP rate = 5% of link capacity, examined time (T) = 10 sec.
[Figure: histograms labeled "before" and "after"; x-axis: throughput (pkts/T); y-axis: number of flows]

15 Influence of Buffer Size
Starvation ratio: percentage of starved flows in 10 seconds.
• Large buffers prevent starvation.
Simulation parameters: N = 400 TCP flows, UDP rate = 5% of C_out, C_out = 100 Mbps, t_p = 0.1 ms, packet size = 1500 bytes, examined time = 10 sec.
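The starvation ratio can be computed from per-flow maximum starvation times. The slide does not say how long a gap must be for a flow to count as starved, so the threshold below is an assumption, and the function name and sample values are hypothetical.

```python
def starvation_ratio(max_starvation_times, threshold_s):
    """Percentage of flows whose longest transmission gap is at least threshold_s.

    threshold_s is an assumed cut-off; the slide does not define one.
    """
    if not max_starvation_times:
        return 0.0
    starved = sum(1 for t in max_starvation_times if t >= threshold_s)
    return 100.0 * starved / len(max_starvation_times)


# Four flows observed for 10 s, with a made-up 1 s threshold
print(starvation_ratio([0.2, 1.5, 3.0, 0.1], threshold_s=1.0))
```

Repeating the calculation for runs with different buffer sizes would give one point per buffer size for the kind of comparison this slide makes.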

16 Another Application: Throughput Collapse (InCast)
[Figure: servers 1, 2, ..., N and one client]
• High drop rate → timeouts → low goodput.
• Links are idle.

17 Throughput Collapse (InCast) (Simulations)
[V. Vasudevan et al., 2008, 2009]

18 FIFO vs. HCF Incast
[Figure: two panels, "Goodput" and "Max. starvation time"]
Simulation parameters: link capacity = 10 Gbps, prop. RTT = 0.02 ms, buffer = 32 packets, block size = 80 MB, packet size = 1000 bytes, no UDP.

19 Summary
• Novel observation:
  - Long TCP flows in datacenter networks can suffer severely from starvation.
• New algorithm:
  - Reduces the starvation.
  - Transparent to the end user.
  - Application to the TCP InCast problem.
• More in the paper:
  - Solution to packet reordering in HCF.
  - Dynamic priority periods.

Thank you.