A Switch-Based Approach to Starvation in Data Centers Alex Shpiner and Isaac Keslassy Department of Electrical Engineering, Technion. Gabi Bracha, Eyal.

Presentation transcript:

A Switch-Based Approach to Starvation in Data Centers
Alex Shpiner and Isaac Keslassy, Department of Electrical Engineering, Technion.
Gabi Bracha, Eyal Dagan, Ofer Iny and Eyal Soha, Broadcom.
Received the best paper award at IEEE IWQoS'10 (International Workshop on Quality of Service).

2 The Problem
Temporary starvation of long TCP flows in datacenter networks.
Crucial effect on applications (e.g. real-time, distributed computing).
Outline:
  Characterization of the datacenter network.
  Why does starvation happen?
  Switch-based solution.

3 Datacenter Network
Low propagation times (t_p): t_p is on the order of µs, instead of ms in the Internet.
Datacenter model:

4 Datacenter Network
Low propagation times (t_p): t_p is on the order of µs, instead of ms in the Internet.
Datacenter model:
  Small t_p => small buffers: B = C * t_p (rule of thumb) [Villamizar et al., 1994].
  Many users with long TCP flows (large N).
[Figure labels: buffer B, link capacity C = 10 Gbps]
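As a rough worked example of the rule of thumb B = C * t_p, the sketch below converts the product into packets. The function name is hypothetical; the two parameter sets are the ones listed on the simulation slides of this deck (100 Mbps, 0.1 ms, 1500-byte packets for the starvation runs; 10 Gbps, 0.02 ms, 1000-byte packets for the InCast runs).

```python
def rule_of_thumb_buffer_pkts(capacity_bps: float, t_p_sec: float, pkt_bytes: int) -> float:
    """Rule-of-thumb buffer B = C * t_p, expressed in packets (hypothetical helper)."""
    buffer_bits = capacity_bps * t_p_sec
    return buffer_bits / (8 * pkt_bytes)


# Starvation simulations: C = 100 Mbps, t_p = 0.1 ms, 1500-byte packets.
starvation_b = rule_of_thumb_buffer_pkts(100e6, 0.1e-3, 1500)
# InCast simulations: C = 10 Gbps, t_p = 0.02 ms, 1000-byte packets.
incast_b = rule_of_thumb_buffer_pkts(10e9, 0.02e-3, 1000)
print(f"{starvation_b:.2f} packets, {incast_b:.2f} packets")
```

With these inputs the rule of thumb gives less than one packet in the first case and about 25 packets in the second, which is the sense in which a small t_p implies small buffers.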

5 Why Starvation?
Total number of packets (Cwnd) >> network capacity: the former is large, the latter is small.
Links and buffers cannot hold all packets of all flows, even if, for each flow, the congestion window Cwnd_i = 1 packet.
High drop rate => timeouts => starvation.
[Figure labels: B, C, flows, packets in links, packets in buffers]
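The arithmetic behind this claim can be sketched with the simulation parameters given on the next slide (400 TCP flows, 100 Mbps, prop. RTT 0.1 ms, 20-packet buffer, 1500-byte packets). The helper name is hypothetical, and counting "network capacity" as packets in flight on the link plus buffer slots is an assumption about what the slide's figure shows.

```python
def network_capacity_pkts(capacity_bps, rtt_sec, pkt_bytes, buffer_pkts):
    """Packets the path can hold: packets in flight on the links plus buffer slots."""
    pkts_in_links = capacity_bps * rtt_sec / (8 * pkt_bytes)
    return pkts_in_links + buffer_pkts


n_flows = 400
min_outstanding = n_flows * 1  # every flow at its minimum window, Cwnd_i = 1 packet
capacity = network_capacity_pkts(100e6, 0.1e-3, 1500, 20)
excess = min_outstanding - capacity  # packets that fit neither on the link nor in the buffer
print(f"outstanding >= {min_outstanding} packets, capacity ~= {capacity:.1f} packets")
```

Even at the minimum window, the flows together keep far more packets outstanding than the path can hold, so drops are unavoidable.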

6 Starvation (Simulations)
Distribution of max. starvation time.
Starvation time = time between two successfully transmitted packets.
Simulation parameters: 400 TCP flows, link capacity = 100 Mbps, prop. RTT = 0.1 ms, buffer = 20 packets, packet size = 1500 bytes, UDP rate = 5% of link capacity.
[Plot axes: max. starvation time (sec), number of flows]
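The starvation metric defined on this slide can be computed per flow from a trace of successful transmission times. The sketch below is a minimal illustration: the function name and the example timestamps are made up, and returning NaN for a flow with fewer than two packets is an assumption.

```python
from typing import Sequence


def max_starvation_time(tx_times: Sequence[float]) -> float:
    """Largest gap between two consecutive successfully transmitted packets of one flow."""
    times = sorted(tx_times)
    if len(times) < 2:
        return float("nan")  # assumption: the metric is undefined with fewer than two packets
    return max(later - earlier for earlier, later in zip(times, times[1:]))


# Illustrative timestamps (seconds), not taken from the simulations.
example_times = [0.00, 0.01, 0.02, 1.25, 1.26]
print(max_starvation_time(example_times))
```

The plotted distribution would then be a histogram of this value over all flows.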

7 Unfairness (Simulations)
Distribution of throughput per flow (unfairness).
Simulation parameters: 400 TCP flows, link capacity = 100 Mbps, prop. RTT = 0.1 ms, buffer = 20 packets, packet size = 1500 bytes, UDP rate = 5% of link capacity, examined time (T) = 10 sec.
[Plot axes: throughput (pkts/T), number of flows]

8 The Goal
1. Reduce starvation of the long TCP flows.
2. Switch-based solution for the datacenter:
   Transparent to the end hosts.
   No change in network topology.
   No significant impact on the switch architecture.
   No additional buffering.

9 Alternative Solutions
TCP throughput collapse (InCast) solutions (require changes in TCP or in the application):
  Reducing and randomizing retransmission timeouts [V. Vasudevan et al., 2009].
  Increasing SRU size, changing TCP [A. Phanishayee et al., 2008].
  Limiting the number of servers, global scheduling [E. Krevat et al., 2007].
Larger buffers [R. Morris, 1997]:
  High delays, requires DRAM memories.

10 Solution Idea
[Figure labels: X, OK, B = 2 pkts]

11 Alternative Fairness Algorithms
Deficit Round-Robin (DRR) [M. Shreedhar and G. Varghese, 1996].
Stochastic Fair Queuing (SFQ) [P. McKenney, 1990].
Drawbacks:
  Inefficient buffer utilization (e.g. with bursts).
  Complicated queue management (RR, LQF).

12 Hashed Credits Fair (HCF)
Bins provide fairness.
HP queue avoids starvation.
LP queue provides high output link utilization.
Time is divided into priority periods:
  At the start of each period, reset credits and change the hash function.
  Fixed vs. dynamic period.
[Figure label: Credits]
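One plausible reading of this slide can be sketched as a toy queue. The slide names the pieces (bins, credits, HP and LP queues, priority periods) but does not spell out how they interact, so the following are assumptions: a flow is hashed to a bin, a packet goes to the HP queue while its bin still has credit and to the LP queue otherwise, HP is served with strict priority, and both queues share one drop-tail buffer. The class name, parameter names and default values are all hypothetical.

```python
import random
from collections import deque


class HCFQueue:
    """Toy sketch of a Hashed Credits Fair output queue (assumed semantics)."""

    def __init__(self, num_bins=8, credits_per_bin=1, buffer_pkts=20, seed=0):
        self.num_bins = num_bins
        self.credits_per_bin = credits_per_bin
        self.buffer_pkts = buffer_pkts  # assumption: HP and LP share one buffer
        self.hp, self.lp = deque(), deque()
        self._rng = random.Random(seed)
        self.start_priority_period()

    def start_priority_period(self):
        """Reset all credits and change the hash function (modelled as a new salt)."""
        self.credits = [self.credits_per_bin] * self.num_bins
        self.salt = self._rng.getrandbits(32)

    def _bin(self, flow_id):
        return hash((self.salt, flow_id)) % self.num_bins

    def enqueue(self, flow_id, pkt):
        """Return False if the packet is dropped because the shared buffer is full."""
        if len(self.hp) + len(self.lp) >= self.buffer_pkts:
            return False
        b = self._bin(flow_id)
        if self.credits[b] > 0:  # bin still has credit: high priority
            self.credits[b] -= 1
            self.hp.append((flow_id, pkt))
        else:  # credit exhausted: low priority
            self.lp.append((flow_id, pkt))
        return True

    def dequeue(self):
        """Strict priority: serve HP first, then LP; None if both are empty."""
        if self.hp:
            return self.hp.popleft()
        if self.lp:
            return self.lp.popleft()
        return None


q = HCFQueue(num_bins=4, credits_per_bin=1)
for seq in range(3):
    q.enqueue("flow-a", seq)  # first packet spends the bin's credit, the rest go to LP
print(len(q.hp), len(q.lp))
```

In this sketch both enqueue and dequeue touch a single bin or queue head, and only the period reset walks the whole bin array.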

13 Hashed Credits Fair (HCF) Complexity
Complexity:
  Enqueuing: O(1)
  Dequeuing: O(1)
  Initialization: O(num. of bins)
Memory space:
  Bin array: O(num. of bins * log(max. credits))
  Additional queue pointers: O(1)
Practically: O(1)
[Figure label: Credits]

14 Preventing Packet Reordering
Solution: queue swapping.
Dynamic priority period: the period ends when the HP queue empties.
[Figure labels: New priority period; Reordering!; packets 1, 3, 2]

15 Preventing Packet Reordering
Solution: queue swapping.
Dynamic priority period: the period ends when the HP queue empties.
[Figure labels: New priority period; No Reordering!; packets 1, 3, 2]
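The queue-swapping idea can be illustrated with a small sketch. It assumes that when the HP queue empties, the period ends, the old LP queue becomes the new HP queue, and credits are reset, so packets left over from the old period leave before anything that arrives later. For brevity it tracks credits per flow instead of per hashed bin; that simplification, the class name and the parameter names are not from the slides.

```python
from collections import deque


class SwappingQueues:
    """Toy HP/LP pair with a dynamic priority period and queue swapping (assumed semantics)."""

    def __init__(self, credits_per_flow=1):
        self.credits_per_flow = credits_per_flow
        self.hp, self.lp = deque(), deque()
        self.used = {}  # credits spent per flow in the current period

    def enqueue(self, flow, seq):
        if self.used.get(flow, 0) < self.credits_per_flow:
            self.used[flow] = self.used.get(flow, 0) + 1
            self.hp.append((flow, seq))
        else:
            self.lp.append((flow, seq))

    def dequeue(self):
        if not self.hp:
            return None
        pkt = self.hp.popleft()
        if not self.hp:
            # HP queue emptied: the period ends. The old LP queue becomes the
            # new HP queue and credits are reset for the new period.
            self.hp, self.lp = self.lp, deque()
            self.used = {}
        return pkt


q = SwappingQueues()
for seq in (1, 2, 3):
    q.enqueue("A", seq)        # 1 goes to HP; 2 and 3 go to LP
order = [q.dequeue()[1]]       # sends 1, HP empties, queues swap
q.enqueue("A", 4)              # new period: 4 joins the swapped queue behind 2 and 3
while (pkt := q.dequeue()) is not None:
    order.append(pkt[1])
print(order)
```

If credits were reset without swapping, packet 4 would enter an empty HP queue and leave ahead of packets 2 and 3, which is the reordering case the previous slide warns about.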

16 FIFO vs. HCF Starvation
Distribution of max. starvation times.
Simulation parameters: 400 TCP flows, link capacity = 100 Mbps, prop. RTT = 0.1 ms, buffer = 20 packets, packet size = 1500 bytes, UDP rate = 5% of link capacity.
[Plot axes: max. starvation time (sec), number of flows; curves labeled "before" and "after"]

17 FIFO vs. HCF Unfairness
Distribution of throughput per flow (unfairness).
Simulation parameters: 400 TCP flows, link capacity = 100 Mbps, prop. RTT = 0.1 ms, buffer = 20 packets, packet size = 1500 bytes, UDP rate = 5% of link capacity, examined time (T) = 10 sec.
[Plot axes: throughput (pkts/T), number of flows; curves labeled "before" and "after"]

18 Influence of Buffer Size
Starvation ratio: percentage of starved flows in 10 seconds.
Large buffers prevent starvation.
Simulation parameters: N = 400 TCP flows, UDP rate = 5% * C_out, C_out = 100 Mbps, t_p = 0.1 ms, packet size = 1500 bytes, examined time = 10 sec.

19 Another Application: Throughput Collapse (InCast)
High drop rate => timeouts => low goodput.
Links are idle.
[Figure labels: servers 1, 2, ..., N; client; R]

20 Throughput Collapse (InCast) (Simulations)
[V. Vasudevan et al., 2008, 2009]

21 FIFO vs. HCF InCast
[Plot panels: goodput; max. starvation time]
Simulation parameters: link capacity = 10 Gbps, prop. RTT = 0.02 ms, buffer = 32 packets, block size = 80 MB, packet size = 1000 bytes, no UDP.

22 Summary
Novel observation: long TCP flows in datacenter networks can suffer severely from starvation.
New algorithm:
  Reduces the starvation.
  Transparent to the end user.
Application to the TCP InCast problem.

Thank you.