2 Cloud Control with Distributed Rate Limiting
Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, Kenneth Yocum, and Alex C. Snoeren
University of California, San Diego

3 Centralized network services
Hosting with a single physical presence
–However, clients are across the Internet

4 Running on a cloud
Resources and clients are across the world
Services combine these distributed resources
[figure label: 1 Gbps]

5 Key challenge
We want to control distributed resources as if they were centralized

6 Ideal: Emulate a single limiter
Make distributed feel centralized
–Packets should experience the same limiter behavior
[figure: sources (S), destinations (D), and limiters, 0 ms apart]

7 Distributed Rate Limiting (DRL)
Achieve functionally equivalent behavior to a central limiter
1. Global Token Bucket (packet-level, general)
2. Global Random Drop (packet-level, general)
3. Flow Proportional Share (flow-level, TCP-specific)

8 Distributed Rate Limiting tradeoffs
Accuracy (how close to K Mbps is delivered, flow rate fairness)
+ Responsiveness (how quickly demand shifts are accommodated)
vs. Communication efficiency (how much and how often rate limiters must communicate)

9 DRL Architecture
[figure: Limiters 1–4 exchange demand estimates via gossip]
At each limiter: on packet arrival, estimate local demand; when the estimate interval timer fires, gossip to learn the global demand, set the local allocation, and enforce the limit
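
A minimal sketch of that per-limiter loop, assuming hypothetical names (`on_packet`, `on_estimate_timer`, the placeholder gossip and allocation policies are illustrative, not the paper's code):

```python
ESTIMATE_INTERVAL = 0.5   # seconds; the experiments use 50-500 ms intervals
GLOBAL_LIMIT = 10e6       # aggregate limit (10 Mbps), illustrative value

class Limiter:
    def __init__(self, peers):
        self.peers = peers            # other limiters to gossip with
        self.local_bytes = 0          # bytes seen in the current interval
        self.local_demand = 0.0       # local demand estimate (bytes/sec)
        self.allocation = GLOBAL_LIMIT

    def on_packet(self, pkt_len):
        # Estimate local demand: count every arriving byte.
        self.local_bytes += pkt_len

    def on_estimate_timer(self):
        # Fires once per estimate interval.
        self.local_demand = self.local_bytes / ESTIMATE_INTERVAL
        self.local_bytes = 0
        # Gossip the local demand to learn the global demand.
        global_demand = self.gossip_demands(self.local_demand)
        # Set the local allocation; it is enforced until the next interval.
        self.allocation = self.set_allocation(global_demand)

    def gossip_demands(self, local_demand):
        # Placeholder: with no peers, global demand equals local demand.
        return local_demand

    def set_allocation(self, global_demand):
        # Placeholder policy: a demand-proportional share of the global limit
        # (the real policies are GTB, GRD, and FPS on the following slides).
        if global_demand == 0:
            return GLOBAL_LIMIT
        return GLOBAL_LIMIT * self.local_demand / global_demand
```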

10 Token Buckets
Token bucket with fill rate K Mbps
[figure: packets drawing tokens from the bucket]
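
For reference, a standard token-bucket limiter (a generic sketch, not the paper's implementation):

```python
import time

class TokenBucket:
    """Token bucket with fill rate K (bytes/sec) and a maximum depth."""
    def __init__(self, fill_rate, depth):
        self.fill_rate = fill_rate
        self.depth = depth
        self.tokens = depth
        self.last = time.monotonic()

    def allow(self, pkt_len):
        # Refill tokens for the time elapsed since the last check.
        now = time.monotonic()
        self.tokens = min(self.depth,
                          self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        # Forward the packet only if enough tokens are available.
        if self.tokens >= pkt_len:
            self.tokens -= pkt_len
            return True
        return False
```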

11 Building a Global Token Bucket
[figure: Limiter 1 and Limiter 2 exchange demand info (bytes/sec)]
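
One way to read this slide: each limiter keeps a bucket filled at the global rate K and also drains it for the remote arrivals its peer reports, so local packets see roughly the state a central bucket would. A hedged sketch continuing the TokenBucket above; the `remote_rate` field and the draining step are my interpretation, not the paper's exact algorithm:

```python
class GlobalTokenBucket(TokenBucket):
    """Local bucket filled at the global rate K, drained by gossiped remote demand."""
    def __init__(self, fill_rate, depth):
        super().__init__(fill_rate, depth)
        self.remote_rate = 0.0   # remote demand (bytes/sec) learned via gossip

    def on_remote_estimate(self, remote_rate):
        self.remote_rate = remote_rate

    def allow(self, pkt_len):
        # Drain tokens on behalf of bytes arriving at the other limiter(s)...
        now = time.monotonic()
        self.tokens -= (now - self.last) * self.remote_rate
        # ...then apply the normal local check (refills at the global rate K).
        return super().allow(pkt_len)
```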

12 Baseline experiment
Limiter 1: 3 TCP flows (S to D)
Limiter 2: 7 TCP flows (S to D)
Compared against a single token bucket carrying all 10 TCP flows (S to D)

13 Global Token Bucket (GTB)
[figure: flow rates under a single token bucket vs. the global token bucket; 3-flow and 7-flow aggregates, 10 TCP flows total, 50 ms estimate interval]
Problem: GTB requires near-instantaneous arrival info

14 Global Random Drop (GRD)
Limiters send and collect global rate info from the others
Example: limit = 5 Mbps, global arrival rate = 4 Mbps
Case 1: Below the global limit, forward the packet

15 Global Random Drop (GRD)
Example: limit = 5 Mbps, global arrival rate = 6 Mbps
Case 2: Above the global limit, drop each packet with probability
  excess / global arrival rate = (6 − 5) / 6 = 1/6
Same drop probability at all limiters
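
The drop decision from slides 14–15 as a small sketch (illustrative only; `global_rate` would come from the gossiped estimates):

```python
import random

def grd_forward(global_rate, limit):
    """Return True to forward the packet, False to drop it (GRD rule)."""
    if global_rate <= limit:
        return True                       # Case 1: below the global limit
    excess = global_rate - limit
    drop_prob = excess / global_rate      # Case 2: e.g. (6 - 5) / 6 = 1/6
    return random.random() >= drop_prob
```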

16 GRD in baseline experiment
[figure: flow rates under a single token bucket vs. global random drop; 3-flow and 7-flow aggregates, 10 TCP flows total, 50 ms estimate interval]
GRD delivers flow behavior similar to a central limiter

17 GRD with flow join (50 ms estimate interval)
[figure: Flow 1 joins at limiter 1, Flow 2 joins at limiter 2, Flow 3 joins at limiter 3]

18 Flow Proportional Share (FPS)
[figure: Limiter 1 with 3 TCP flows (S to D), Limiter 2 with 7 TCP flows (S to D)]

19 Flow Proportional Share (FPS)
Goal: Provide inter-flow fairness for TCP flows
Local token-bucket enforcement
[figure: Limiter 1 reports “3 flows”, Limiter 2 reports “7 flows”]

20 Estimating TCP demand
[figure: Limiter 1 with 1 TCP flow, Limiter 2 with 3 TCP flows; sources S, destinations D]

21 Estimating TCP demand
Local token rate (limit) = 10 Mbps
Flow A = 5 Mbps, Flow B = 5 Mbps
Flow count = 2 flows

22 Estimating TCP demand
[figure: Limiter 1 with 1 TCP flow plus a second TCP flow, Limiter 2 with 3 TCP flows; sources S, destinations D]

23 Estimating skewed TCP demand
Local token rate (limit) = 10 Mbps
Flow A = 2 Mbps (bottlenecked elsewhere), Flow B = 8 Mbps
Flow count ≠ demand
Key insight: use a TCP flow’s rate to infer demand

24 Estimating skewed TCP demand
Local token rate (limit) = 10 Mbps
Flow A = 2 Mbps (bottlenecked elsewhere), Flow B = 8 Mbps
Effective flow count = local limit / largest flow’s rate = 10 / 8 = 1.25 flows

25 Flow Proportional Share (FPS)
Global limit = 10 Mbps
Set local token rate = global limit × local flow count / total flow count
  = 10 Mbps × 1.25 / (1.25 + 2.50) = 3.33 Mbps
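
Slides 21–25 combined into a sketch (illustrative only; the full FPS algorithm in the paper handles more cases, such as under-utilized limiters):

```python
def effective_flow_count(local_limit, flow_rates):
    """Estimate local demand in 'flows' as local limit / largest flow's rate.

    Two unbottlenecked 5 Mbps flows under a 10 Mbps limit give 2.0 flows;
    a 2 Mbps flow bottlenecked elsewhere plus an 8 Mbps flow give
    10 / 8 = 1.25 flows.
    """
    return local_limit / max(flow_rates)

def fps_local_rate(global_limit, local_count, all_counts):
    """Local token rate = global limit * local flow count / total flow count."""
    return global_limit * local_count / sum(all_counts)

# Example from the slides: 10 Mbps * 1.25 / (1.25 + 2.50) = 3.33 Mbps
rate = fps_local_rate(10.0, 1.25, [1.25, 2.50])
```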

26 Under-utilized limiters
[figure: Limiter 1 with 1 TCP flow and wasted rate]
Set the local limit equal to actual usage (the limiter returns to full utilization)

27 Flow Proportional Share (FPS)
[figure: FPS results, 500 ms estimate interval]

28 Additional issues
What if a limiter has no flows and one arrives?
What about bottlenecked traffic?
What about varied-RTT flows?
What about short-lived vs. long-lived flows?
Experimental evaluation in the paper
–Evaluated on a testbed and over PlanetLab

29 Cloud control on PlanetLab
Apache Web servers on 10 PlanetLab nodes
5 Mbps aggregate limit
Shift load over time from 10 nodes to 4 nodes

30 Static rate limiting
[figure: demands at 10 Apache servers on PlanetLab]
Demand shifts to just 4 nodes, leaving wasted capacity

31 FPS (top) vs. Static limiting (bottom)

32 Conclusions
Protocol-agnostic limiting comes at extra cost
–Requires shorter estimate intervals
Fine-grained packet arrival info is not required
–For TCP, flow-level granularity is sufficient
Many avenues left to explore
–Inter-service limits, other resources (e.g. CPU)

33 Questions!

34 FPS state diagram
Case 1: Fully utilized
Case 2: Underutilized: set the local limit to actual usage
Transitions triggered by flows starting/ending, network congestion, bottlenecked flows
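
A hedged sketch of the two cases on this slide (the function name, arguments, and the utilization test are illustrative; the full transition logic is in the paper):

```python
def fps_local_limit(global_limit, local_count, all_counts,
                    local_usage, current_limit):
    """Choose the next local limit depending on whether this limiter is fully utilized."""
    if local_usage < current_limit:
        # Case 2: underutilized -- return the wasted rate by limiting to actual usage.
        return local_usage
    # Case 1: fully utilized -- take a flow-proportional share of the global limit.
    return global_limit * local_count / sum(all_counts)
```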

35 DRL cheat-proofness
Conservation of rate among limiters for FPS
EWMA compensates for past cheating with higher drop rates in the future
To cheat GRD, a limiter must forever increase its reported demand
Authenticated inter-limiter communication is assumed
Difficult to quickly move traffic demands

36 DRL applications
Cloud-based services (e.g. Amazon’s EC2/S3)
Content-distribution networks (e.g. Akamai)
Distributed VPN limiting
Internet testbeds (e.g. PlanetLab)
Overlay network service limiting

37 DRL ≠ QoS
DRL provides a fixed, aggregate rate limit
No service classes
No bandwidth guarantees
No reservations
No explicit fairness; only implicit (TCP) fairness

38 DRL provisioning
Providers could ensure that the full K Mbps limit is available at every location, but that is wasteful
We expect provisioning to follow current practice: statistical multiplexing
Even today, a single pipe cannot guarantee bandwidth to all destinations

39 Experimental setup
ModelNet network emulation on a testbed
40 ms inter-limiter RTT, emulated 100 Mbps links
No explicit inter-limiter loss
Ran limiters across 7 testbed machines
Linux 2.6; packet capture via iptables ipq

40 Short flows with FPS
2 limiters, 10 bulk TCP flows vs. Web traffic, 10 Mbps limit
Web traffic based on a CAIDA OC-48 trace
Loaded via httperf, Poisson arrivals (μ = 15)

                        CTB       GRD       FPS
Bulk rate            6900.90   7257.87   6989.76
Fairness (Jain’s)      0.971     0.997     0.962
Web rate [0, 5K)       28.17     25.84     25.71
  [5K, 50K)           276.18    342.96    335.80
  [50K, 500K)         472.09    612.08    571.40
  [500K, ∞)           695.40    751.98    765.26

41 Bottlenecked flows with FPS
Baseline experiment: 3-7 flow split, 10 Mbps limit
At 15 s, the 7-flow aggregate is limited to 2 Mbps upstream
At 31 s, 1 unbottlenecked flow arrives and joins the 7 flows

42 RTT heterogeneity with FPS
FPS doesn’t reproduce RTT unfairness
7 flows (10 ms RTT) vs. 3 flows (100 ms RTT)

                      CTB     GRD     FPS
Short RTT (Mbps)     1.41    1.35    0.92
  (stddev)           0.16    0.71    0.15
Long RTT (Mbps)      0.10    0.16    0.57
  (stddev)           0.01    0.03    0.05

43 Scaling discussion
How do various parameters affect scaling?
Number of flows present: per-limiter smoothing
Number of limiters: ~linear overhead increase
Estimate interval: limits responsiveness
Inter-limiter latency: limits responsiveness
Size of rate limit delivered: orthogonal

44 Communication fabric requirements
Up-to-date information about the global rate
Many designs are possible:
Full mesh (all-pairs)
Gossip (contact k peers per round)
Tree-based
…

45 Gossip protocol
Based on the protocol by Kempe et al. (FOCS 2003)
Send packets to k peers per estimate interval
Each update contains 2 floats: a value and a weight
To send an update to k peers, divide the value and weight each by (k+1), store the result locally, and send k updates out
To merge an update, add the update’s value and weight to the locally known value and weight
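
A compact sketch of that value/weight (push-sum style) update; peer transport is left abstract and the class and method names are illustrative:

```python
class GossipNode:
    """Kempe-style push-sum: each node tracks a (value, weight) pair."""
    def __init__(self, value):
        self.value = value   # e.g. this limiter's demand estimate
        self.weight = 1.0

    def make_updates(self, k):
        # Divide value and weight by (k + 1): keep one share, send k shares out.
        share = (self.value / (k + 1), self.weight / (k + 1))
        self.value, self.weight = share
        return [share] * k   # one (value, weight) update per contacted peer

    def merge(self, update):
        # Add the incoming value and weight to the locally known pair.
        value, weight = update
        self.value += value
        self.weight += weight

    def estimate(self):
        # value / weight converges to the average across nodes; multiplying by
        # the number of limiters recovers the global (summed) demand.
        return self.value / self.weight
```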

