DeTail: Reducing the Flow Completion Time Tail in Datacenter Networks, SIGCOMM 2012 (2013.05.20, PIGGY)

Presentation transcript:

DeTail: Reducing the Flow Completion Time Tail in Datacenter Networks SIGCOMM PIGGY

OUTLINE
Motivation
Impact of the long tail
Causes of long tails
DETAIL
Experiment
Experimental result

TYPICAL FACEBOOK PAGE Modern pages have MANY components

CREATING A PAGE

REQUIREMENT
Servers must perform hundreds of data retrievals
  Many of which must be performed serially
While meeting a deadline of ms
  SLA measured at the 99.9th percentile
Only have 2-3ms per data retrieval
  Including communication and computation

MEASURED RTT
Median RTT is 334μs, but 6% of RTTs take over 2ms
RTTs can be as high as 14ms

WHY THE TAIL MATTERS
Recall: hundreds of data retrievals per page creation
A single data retrieval is unlikely to take too long, but with this many retrievals it is likely to happen on every page creation
Data retrieval dependencies can magnify the impact
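
The "unlikely becomes likely" argument can be checked with a short calculation. The sketch below assumes retrievals are independent (a simplification that the slides do not state) and reuses the 6% figure from the MEASURED RTT slide as the chance that one retrieval is slow. The function name is illustrative only.

```python
def prob_any_slow(p_slow: float, n_retrievals: int) -> float:
    """Probability that at least one of n retrievals is slow.

    Assumes each retrieval is slow independently with probability p_slow.
    """
    return 1.0 - (1.0 - p_slow) ** n_retrievals


if __name__ == "__main__":
    p = 0.06  # share of RTTs over 2 ms, taken from the MEASURED RTT slide
    for n in (1, 10, 100):
        print(f"{n:>3} retrievals: P(at least one slow) = {prob_any_slow(p, n):.4f}")
```

Under these assumptions, a page that needs 100 retrievals hits at least one slow retrieval more than 99% of the time.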

PARTITION-AGGREGATE 10ms deadline

SEQUENTIAL
Under the RTT distribution, 150 data retrievals take 200ms (ignoring computation time)
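
A small Monte Carlo sketch can show how the two dependency patterns amplify the tail: a partition-aggregate step waits for the slowest of its parallel retrievals, while a sequential workflow pays the sum. The RTT model here is a crude assumption built from the MEASURED RTT slide (94% of retrievals at the 334μs median, 6% spread uniformly between 2ms and 14ms). It is not the measured distribution, so its output should not be expected to match the 200ms figure.

```python
import random


def sample_rtt_ms(rng: random.Random) -> float:
    """Draw one RTT from an assumed two-part model (not the measured data)."""
    if rng.random() < 0.06:
        return rng.uniform(2.0, 14.0)  # slow retrieval
    return 0.334                        # typical retrieval at the median


def tail(values, fraction=0.999):
    """Value at the given fraction of the sorted samples."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(len(ordered) * fraction))
    return ordered[index]


def simulate(n_retrievals: int, trials: int, seed: int = 0):
    """Return (sequential tail, parallel tail) in ms over many page creations."""
    rng = random.Random(seed)
    sequential, parallel = [], []
    for _ in range(trials):
        rtts = [sample_rtt_ms(rng) for _ in range(n_retrievals)]
        sequential.append(sum(rtts))  # retrievals issued one after another
        parallel.append(max(rtts))    # aggregate waits for the slowest one
    return tail(sequential), tail(parallel)


if __name__ == "__main__":
    seq_tail, par_tail = simulate(n_retrievals=150, trials=5000)
    print(f"sequential tail: {seq_tail:.1f} ms, parallel tail: {par_tail:.1f} ms")
```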

APP-LEVEL MITIGATION
Use timeouts & retries for critical data retrievals
  Inefficient because of high network variance
  Must choose between conservative timeouts with long delays and tight timeouts with increased server load
Hide the problem from the user
  By caching and serving stale data
  By rendering pages incrementally
  User often notices and becomes annoyed or frustrated

CAUSES OF LONG TAILS
Data retrievals are short, highly variable flows
  Typically under 20KB in size, with many under 2KB
Short flows provide insufficient information for the transport to respond agilely to packet drops
Variable flow sizes decrease the efficacy of network-layer load balancers

DETAIL
Use in-network mechanisms to maximize agility
Remove restrictions that hinder performance
Well-suited for datacenters
  Single administrative domain
  Reduced backward compatibility requirements

DETAIL STACK

SWITCH ARCHITECTURE

LINK LAYER

NETWORK LAYER
Congestion-based load balancing
Adaptive load balancing

LOAD BALANCING EFFICIENTLY
DC flows have varying timeliness requirements*
  How to efficiently consider packet priority?
Compare queue occupancies for every decision
  How to efficiently compare many of them?
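
One way to picture the two questions above is a per-packet port choice that only counts traffic the packet would actually wait behind, and that replaces exact occupancy comparisons with a cheap threshold test. The sketch below is an illustration under assumptions, not the switch logic from the slides: the threshold value, the function names, and the convention that priority 0 is the highest are all hypothetical.

```python
import random

# Hypothetical threshold; the slides do not give a value.
THRESHOLD_BYTES = 16_000


def bytes_ahead(port_queues: dict, priority: int) -> int:
    """Bytes queued on a port at this priority or higher (0 = highest)."""
    return sum(size for prio, size in port_queues.items() if prio <= priority)


def favored_ports(occupancy: dict, acceptable: list, priority: int,
                  threshold: int = THRESHOLD_BYTES) -> list:
    """Acceptable ports whose relevant occupancy is under the threshold."""
    return [port for port in acceptable
            if bytes_ahead(occupancy[port], priority) < threshold]


def pick_port(occupancy: dict, acceptable: list, priority: int,
              threshold: int = THRESHOLD_BYTES, rng=random) -> int:
    """Pick a lightly loaded port; fall back to any acceptable port."""
    candidates = favored_ports(occupancy, acceptable, priority, threshold)
    return rng.choice(candidates or list(acceptable))


if __name__ == "__main__":
    # occupancy[port][priority] = queued bytes (made-up numbers)
    occupancy = {0: {0: 0, 1: 50_000}, 1: {0: 30_000, 1: 0}}
    print(pick_port(occupancy, [0, 1], priority=0))  # port 0 is favored
```

The threshold test turns "compare many occupancies" into one comparison per port, at the cost of treating all ports under the threshold as equally good.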

PRIORITY IN LOAD BALANCING

REORDER-RESISTANT TRANSPORT LAYER
Handle packet reordering due to load balancing
  Disable TCP’s fast recovery and fast retransmission
Respond to congestion (no more packet drops)
  Monitor output queues and use ECN to throttle flows
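
The two halves of this slide can be sketched as a toy sender plus a switch-side marking rule. This is a simplified model under assumptions: the class, the marking threshold, and the halve-on-ECN reaction are illustrative choices, and a real stack would react at most once per round trip rather than on every marked ACK.

```python
from dataclasses import dataclass


def should_mark_ecn(queue_bytes: int, threshold_bytes: int) -> bool:
    """Switch side: mark packets once the output queue passes a threshold."""
    return queue_bytes > threshold_bytes


@dataclass
class ReorderTolerantSender:
    """Toy sender: ignores duplicate ACKs and slows down only on ECN echoes."""
    cwnd: float = 10.0      # congestion window in packets
    min_cwnd: float = 1.0

    def on_ack(self, ecn_echo: bool, duplicate: bool) -> None:
        if duplicate:
            # With per-packet load balancing, duplicate ACKs signal
            # reordering rather than loss, so no fast retransmit.
            return
        if ecn_echo:
            self.cwnd = max(self.min_cwnd, self.cwnd / 2.0)  # throttle
        else:
            self.cwnd += 1.0 / self.cwnd  # additive increase


if __name__ == "__main__":
    sender = ReorderTolerantSender()
    sender.on_ack(ecn_echo=False, duplicate=True)   # reordering: no change
    sender.on_ack(ecn_echo=should_mark_ecn(70_000, 65_536), duplicate=False)
    print(sender.cwnd)
```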

SIMULATION AND IMPLEMENTATION
NS-3 simulation
Click implementation
  Drivers and NICs buffer hundreds of packets
  Must rate-limit Click to underflow buffers

TOPOLOGY
FatTree: 128-server (NS-3) / 16-server (Click)
Oversubscription factor of 4x

SETUP
Baseline
TCP NewReno
Flow Hashing
Lossless Packet Scatter
Prioritization of data retrievals vs. background
Metric
Reduction in 99.9th percentile completion time
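
The metric can be made concrete with a short helper. This is a minimal sketch assuming a nearest-rank percentile. The slides do not say which percentile definition was used, and the function names and sample values are illustrative only.

```python
import math


def nearest_rank_percentile(samples, q: float) -> float:
    """Nearest-rank percentile of the samples, with q in (0, 100]."""
    ordered = sorted(samples)
    rank = math.ceil(q * len(ordered) / 100.0)
    rank = min(max(rank, 1), len(ordered))  # clamp to a valid position
    return ordered[rank - 1]


def tail_reduction(baseline, candidate, q: float = 99.9) -> float:
    """Fractional reduction in the q-th percentile completion time."""
    base = nearest_rank_percentile(baseline, q)
    cand = nearest_rank_percentile(candidate, q)
    return (base - cand) / base


if __name__ == "__main__":
    # Made-up completion times in ms, only to show the call pattern.
    baseline = [1.0, 1.2, 1.1, 9.0]
    candidate = [1.0, 1.1, 1.2, 3.0]
    print(f"reduction: {tail_reduction(baseline, candidate):.0%}")
```

A value of 0.5 from `tail_reduction` corresponds to a 50% reduction in the tail completion time.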

PAGE CREATION WORKLOAD
Retrieval size: 2, 4, 8, 16, 32 KB
Background traffic: 1MB flows

LONG BACKGROUND FLOWS
Background traffic: 1, 16, 64MB flows
Light data retrieval traffic

CONCLUSION
Long tail harms page creation
  The extreme case becomes the common case
  Limits number of data retrievals per page
The DeTail stack improves long tail performance
  Can reduce the 99.9th percentile by more than 50%