Cloud Control with Distributed Rate Limiting
Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, Kenneth Yocum, and Alex C. Snoeren
University of California, San Diego

Centralized network services
Hosting with a single physical presence
–However, clients are spread across the Internet

Running on a cloud
Resources and clients are spread across the world
Services combine these distributed resources
[figure: globally distributed resources connected by 1 Gbps links]

Key challenge
We want to control distributed resources as if they were centralized

Ideal: Emulate a single limiter
Make distributed feel centralized
–Packets should experience the same limiter behavior
[figure: sources (S) and destinations (D) traversing a set of limiters; ideal inter-limiter latency 0 ms]

Distributed Rate Limiting (DRL)
Achieve functionally equivalent behavior to a central limiter
1. Global Token Bucket (packet-level, general)
2. Global Random Drop (packet-level, general)
3. Flow Proportional Share (flow-level, TCP-specific)

Distributed Rate Limiting tradeoffs
Accuracy (how close to K Mbps is delivered, flow rate fairness)
+ Responsiveness (how quickly demand shifts are accommodated)
vs. Communication efficiency (how much and how often rate limiters must communicate)

DRL Architecture
Limiters 1-4 exchange demand estimates via gossip
At each limiter: packet arrivals feed a local demand estimate; on each estimate-interval timer, the limiter gossips to learn global demand, sets its allocation, and enforces the limit

Token Buckets
A token bucket with fill rate K Mbps; each arriving packet consumes tokens in order to be forwarded
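For reference, a minimal token bucket in Python (a generic sketch; names and parameters are ours, not the paper's implementation):

```python
import time

class TokenBucket:
    """Classic token bucket: tokens accrue at the fill rate, and a
    packet is forwarded only if enough tokens cover its size."""

    def __init__(self, fill_rate_bps, depth_bytes):
        self.fill_rate = fill_rate_bps / 8.0   # token accrual, bytes/sec
        self.depth = float(depth_bytes)        # maximum burst size
        self.tokens = float(depth_bytes)
        self.last = time.monotonic()

    def admit(self, pkt_bytes):
        now = time.monotonic()
        # Accrue tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.depth,
                          self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        if self.tokens >= pkt_bytes:
            self.tokens -= pkt_bytes
            return True    # forward
        return False       # drop (or queue)
```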

Building a Global Token Bucket
Limiters 1 and 2 exchange demand info (bytes/sec)
Each limiter drains its local bucket by the demand observed at its peers, as if those bytes had arrived locally
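One way to sketch this, assuming the gossip layer periodically reports the peers' aggregate byte rate (names ours):

```python
import time

class GlobalTokenBucket:
    """GTB sketch: a local bucket filled at the global rate K but
    drained both by local packets and, continuously, by the byte
    rate gossiped from the remote limiters."""

    def __init__(self, fill_rate_bps, depth_bytes):
        self.fill_rate = fill_rate_bps / 8.0  # token accrual, bytes/sec
        self.depth = float(depth_bytes)
        self.tokens = float(depth_bytes)
        self.remote_rate = 0.0                # peers' demand, bytes/sec
        self.last = time.monotonic()

    def on_gossip(self, remote_bytes_per_sec):
        # Most recent estimate of aggregate remote arrivals.
        self.remote_rate = remote_bytes_per_sec

    def admit(self, pkt_bytes):
        now = time.monotonic()
        dt = now - self.last
        self.last = now
        # Tokens accrue at the fill rate and drain at the remote rate.
        self.tokens = min(self.depth,
                          self.tokens + dt * (self.fill_rate - self.remote_rate))
        self.tokens = max(self.tokens, 0.0)
        if self.tokens >= pkt_bytes:
            self.tokens -= pkt_bytes
            return True
        return False
```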

Baseline experiment
Limiter 1: 3 TCP flows; Limiter 2: 7 TCP flows
Reference: a single token bucket carrying all 10 TCP flows

Global Token Bucket (GTB)
[plots: single token bucket vs. global token bucket; 3-, 7-, and 10-flow aggregates; 50ms estimate interval]
Problem: GTB requires near-instantaneous arrival info

Global Random Drop (GRD)
Limiters send and collect global rate info from one another
Case 1: global arrival rate (4 Mbps) is below the limit (5 Mbps): forward the packet

Global Random Drop (GRD)
Case 2: global arrival rate (6 Mbps) exceeds the limit (5 Mbps): drop with probability
  excess / global arrival rate = 1 Mbps / 6 Mbps = 1/6
The drop probability is the same at all limiters
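The per-packet GRD decision is small enough to write out; a sketch (function name is ours):

```python
import random

def grd_admit(global_rate_mbps, limit_mbps):
    """Global Random Drop: when the aggregate rate across all limiters
    exceeds the limit, drop with probability excess / global rate.
    Every limiter computes the same probability from gossiped state."""
    if global_rate_mbps <= limit_mbps:
        return True  # Case 1: below the limit, always forward
    drop_prob = (global_rate_mbps - limit_mbps) / global_rate_mbps
    # e.g. 6 Mbps arrivals vs. 5 Mbps limit -> drop 1/6 of packets
    return random.random() >= drop_prob
```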

GRD in baseline experiment
[plots: single token bucket vs. global random drop; 3-, 7-, and 10-flow aggregates; 50ms estimate interval]
GRD delivers flow behavior similar to a central limiter

GRD with flow join (50ms estimate interval)
Flow 1 joins at limiter 1, flow 2 joins at limiter 2, flow 3 joins at limiter 3

Flow Proportional Share (FPS)
Limiter 1: 3 TCP flows; Limiter 2: 7 TCP flows

Flow Proportional Share (FPS)
Limiter 1 reports "3 flows"; Limiter 2 reports "7 flows"
Goal: provide inter-flow fairness for TCP flows
Enforcement via local token buckets

Estimating TCP demand
Limiter 1: 1 TCP flow; Limiter 2: 3 TCP flows

Estimating TCP demand
Local token rate (limit) = 10 Mbps
Flow A = 5 Mbps, Flow B = 5 Mbps
Flow count = 2 flows

Estimating TCP demand
Limiter 1: 1 TCP flow; Limiter 2: 3 TCP flows and 1 additional TCP flow

Estimating skewed TCP demand
Local token rate (limit) = 10 Mbps
Flow A = 2 Mbps (bottlenecked elsewhere), Flow B = 8 Mbps
Flow count ≠ demand
Key insight: use a TCP flow's rate to infer demand

Estimating skewed TCP demand
Local token rate (limit) = 10 Mbps
Flow A = 2 Mbps (bottlenecked elsewhere), Flow B = 8 Mbps
Weighted flow count = local limit / largest flow's rate = 10 / 8 = 1.25 flows

Flow Proportional Share (FPS)
Global limit = 10 Mbps
Set local token rate = (global limit × local flow count) / total flow count
In this example: 10 Mbps × (local flow count / total flow count) = 3.33 Mbps
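A compact sketch of the two FPS computations above (function names are ours; the paper additionally smooths these estimates over time):

```python
def weighted_flow_count(local_limit_mbps, flow_rates_mbps):
    """Demand estimate at one limiter: flows bottlenecked elsewhere
    count as fractions of a flow, inferred from the largest flow's
    rate. E.g. limit 10 Mbps, rates [2, 8] -> 10 / 8 = 1.25 flows."""
    return local_limit_mbps / max(flow_rates_mbps)

def fps_local_rate(global_limit_mbps, local_count, all_counts):
    """Allocate the global limit in proportion to this limiter's
    share of the total (weighted) flow count."""
    return global_limit_mbps * local_count / sum(all_counts)

# Example: with weighted counts of 1.25 (here) and 2.5 (elsewhere),
# this limiter gets 10 * 1.25 / 3.75 ~= 3.33 Mbps of the 10 Mbps limit.
```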

Under-utilized limiters
If a limiter is not using its full allocation, set its local limit equal to actual usage
The freed-up ("wasted") rate is reallocated to other limiters until this limiter returns to full utilization

Flow Proportional Share (FPS) (500ms estimate interval)

Additional issues
What if a limiter has no flows and one arrives?
What about bottlenecked traffic?
What about flows with varied RTTs?
What about short-lived vs. long-lived flows?
Experimental evaluation in the paper
–Evaluated on a testbed and over PlanetLab

Cloud control on PlanetLab
Apache Web servers on 10 PlanetLab nodes
5 Mbps aggregate limit
Shift load over time from 10 nodes to 4 nodes

Static rate limiting
Demands at 10 Apache servers on PlanetLab
When demand shifts to just 4 nodes, static per-node limits leave capacity wasted

FPS (top) vs. Static limiting (bottom)

Conclusions
Protocol-agnostic limiting works, at extra cost
–Requires shorter estimate intervals
Fine-grained packet arrival info is not required
–For TCP, flow-level granularity is sufficient
Many avenues left to explore
–Inter-service limits, other resources (e.g., CPU)

Questions!

FPS state diagram
Case 1: fully utilized
Case 2: underutilized; set local limit to actual usage
Transitions between cases: flows start/end, network congestion, bottlenecked flows

DRL cheat-proofness
FPS conserves rate among limiters
The EWMA compensates for past cheating with higher drop rates in the future
To cheat GRD, a limiter must increase its reported demand forever
Authenticated inter-limiter communication is assumed
Traffic demands are difficult to shift quickly

DRL applications
Cloud-based services (e.g., Amazon's EC2/S3)
Content-distribution networks (e.g., Akamai)
Distributed VPN limiting
Internet testbeds (e.g., PlanetLab)
Overlay network service limiting

DRL ≠ QoS
DRL provides a fixed, aggregate rate limit
No service classes
No bandwidth guarantees
No reservations
No explicit fairness; only implicit (TCP) fairness

DRL provisioning
Providers could ensure the full limit of K Mbps is available at every location, but that is wasteful
We expect practice to mirror today's: statistical multiplexing
Even today, a single pipe can't guarantee bandwidth to all destinations

Experimental setup
ModelNet network emulation on a local testbed
40ms inter-limiter RTT; emulated 100 Mbps links
No explicit inter-limiter loss
Limiters ran across 7 testbed machines
Linux 2.6; packet capture via iptables (ip_queue)

Short flows with FPS
2 limiters, 10 bulk TCP flows vs. Web traffic, 10 Mbps limit
Web traffic based on a CAIDA OC-48 trace
Load generated via httperf, Poisson arrivals (μ = 15)
[table: bulk rate, Jain's fairness, and Web rates by object size ([0, 5K), [5K, 50K), [50K, 500K), [500K, ∞) bytes) under CTB, GRD, and FPS; values not preserved in this transcript]

Bottlenecked flows with FPS
Baseline experiment: 3-7 flow split, 10 Mbps limit
At 15s, the 7-flow aggregate is limited to 2 Mbps upstream
At 31s, 1 unbottlenecked flow joins the 7 flows

RTT heterogeneity with FPS
FPS doesn't reproduce TCP's RTT unfairness
7 flows (10 ms RTT) vs. 3 flows (100 ms RTT)
[table: short-RTT and long-RTT throughput (Mbps, with stddev) under CTB, GRD, and FPS; values not preserved in this transcript]

Scaling discussion
How do various parameters affect scaling?
Number of flows present: per-limiter smoothing
Number of limiters: ~linear overhead increase
Estimate interval: limits responsiveness
Inter-limiter latency: limits responsiveness
Size of the rate limit delivered: orthogonal

Communication fabric requirements
Must provide up-to-date information about the global rate
Many designs are possible:
Full mesh (all-pairs)
Gossip (contact k peers per round)
Tree-based
…

Gossip protocol
Based on the push-sum protocol of Kempe et al. (FOCS 2003)
Send packets to k peers per estimate interval
Each update contains 2 floats: a value and a weight
To send updates to k peers: divide the local value and weight each by (k+1), keep one share locally, send the other k shares out
To merge an update: add the update's value and weight to the locally known value and weight
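A minimal sketch of those send/merge rules (push-sum style; class and method names are ours, and transport is elided to plain function calls):

```python
import random
from dataclasses import dataclass

@dataclass
class Update:
    value: float   # share of a locally measured quantity (e.g. demand)
    weight: float  # push-sum weight; value/weight estimates the average

class GossipNode:
    """Push-sum gossip: each round, split (value, weight) into k+1
    equal shares, keep one, and send k to random peers. Every node's
    value/weight ratio converges to the global average."""

    def __init__(self, local_demand):
        self.value = float(local_demand)
        self.weight = 1.0

    def gossip_round(self, peers, k):
        share = 1.0 / (k + 1)
        out = Update(self.value * share, self.weight * share)
        self.value *= share    # keep one share locally
        self.weight *= share
        for peer in random.sample(peers, k):
            peer.merge(out)    # send k identical shares out

    def merge(self, update):
        # Merging adds the received value and weight to local state.
        self.value += update.value
        self.weight += update.weight

    def estimate(self):
        return self.value / self.weight  # approximate global average demand
```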