Slide 1: Managing Cloud Resources: Distributed Rate Limiting
Alex C. Snoeren, with Kevin Webb, Bhanu Chandra Vattikonda, Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, and Kenneth Yocum
Building and Programming the Cloud Workshop, 13 January 2010

Slide 2: Centralized Internet services
- Hosting with a single physical presence
  - However, clients are spread across the Internet

Slide 3: Cloud-based services
- Resources and clients are distributed across the world
  - Often incorporates resources from multiple providers (e.g., Windows Live)

Slide 4: Resources in the Cloud
- Distributed resource consumption
  - Clients consume resources at multiple sites
  - Metered billing is the state of the art
  - Services are "punished" for popularity
    - Those unable to pay are disconnected
  - No control over the resources used to serve increased demand
- Overprovision and pray
  - Application designers typically cannot describe their needs
  - Individual service bottlenecks vary but are severe
    - IOPS, network bandwidth, CPU, RAM, etc.
    - Need a way to balance resource demand

Slide 5: Two lynchpins for success
- Need a way to control and manage distributed resources as if they were centralized
  - All current models from the OS scheduling and provisioning literature assume full knowledge and absolute control
  - (This talk focuses specifically on network bandwidth)
- Must be able to efficiently support rapidly evolving application demand
  - Match resource needs to the hardware automatically, without application-designer input
  - (Another talk, if you're interested)

Slide 6: Ideal: Emulate a single limiter
[Figure: sources (S) sending to destinations (D) through a set of distributed limiters]
- Make distributed feel centralized
  - Packets should experience the same limiter behavior as at a single limiter

Slide 7: Engineering tradeoffs
- Accuracy (how close to K Mbps is delivered; flow-rate fairness)
- plus Responsiveness (how quickly demand shifts are accommodated)
- versus Communication efficiency (how much, and how often, the rate limiters must communicate)

Slide 8: An initial architecture
[Figure: four limiters exchanging demand estimates via gossip]
- On packet arrival: enforce the current local limit
- On each estimate-interval timer: estimate local demand, gossip it to the other limiters, compute global demand, and set the local allocation (sketched below)
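A minimal sketch of this loop in Python (the `peers.gossip` transport, the method names, and the demand-proportional allocation policy are all illustrative assumptions, not the talk's design):

```python
class Limiter:
    """One node of a distributed rate limiter (illustrative sketch)."""

    def __init__(self, global_limit_bps, estimate_interval_s, peers):
        self.global_limit_bps = global_limit_bps
        self.estimate_interval_s = estimate_interval_s
        self.peers = peers                 # hypothetical gossip transport
        self.local_bytes = 0               # bytes seen in the current interval
        self.allocation_bps = global_limit_bps

    def on_packet(self, size_bytes):
        """Packet arrival: count local demand and enforce the current allocation."""
        self.local_bytes += size_bytes
        interval_rate_bps = 8 * self.local_bytes / self.estimate_interval_s
        return interval_rate_bps <= self.allocation_bps  # True = forward, False = drop

    def on_timer(self):
        """Estimate-interval timer: estimate local demand, gossip it,
        compute global demand, and set the next allocation."""
        local_demand_bps = 8 * self.local_bytes / self.estimate_interval_s
        self.local_bytes = 0
        remote_demands = self.peers.gossip(local_demand_bps)  # assumed: peers' rates
        global_demand = local_demand_bps + sum(remote_demands)
        if global_demand > 0:
            # Placeholder policy: a demand-proportional share of the global limit.
            self.allocation_bps = self.global_limit_bps * local_demand_bps / global_demand
```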

Slide 9: Token bucket limiters
[Figure: a token bucket with fill rate K Mbps; each arriving packet consumes tokens]
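A minimal sketch of the token-bucket check each limiter applies locally (a textbook formulation, not code from the talk):

```python
import time

class TokenBucket:
    """Classic token bucket: forward a packet only if enough tokens remain."""

    def __init__(self, fill_rate_bps, burst_bytes):
        self.fill_rate = fill_rate_bps / 8.0   # token fill rate, bytes/sec
        self.capacity = burst_bytes            # maximum burst size
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def allow(self, packet_bytes):
        now = time.monotonic()
        # Refill for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        if self.tokens >= packet_bytes:
            self.tokens -= packet_bytes
            return True    # forward
        return False       # drop
```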

Slide 10: A global token bucket (GTB)?
[Figure: Limiter 1 and Limiter 2 exchange demand info (bytes/sec)]
- Idea: each limiter emulates the single bucket, draining it with both local arrivals and the arrivals reported by its peers
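A sketch of the GTB idea, assuming gossip delivers the byte counts seen at the other limiters: the emulated bucket fills at the full rate K but is drained by local and reported remote arrivals alike (the class and method names are illustrative):

```python
import time

class GlobalTokenBucket:
    """GTB sketch: one emulated global bucket per limiter, drained by all arrivals."""

    def __init__(self, fill_rate_bps, burst_bytes):
        self.fill_rate = fill_rate_bps / 8.0   # bytes/sec, the full global rate K
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def _refill(self):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.fill_rate)
        self.last = now

    def allow_local(self, packet_bytes):
        """Local packet: forward only if the emulated global bucket has tokens."""
        self._refill()
        if self.tokens >= packet_bytes:
            self.tokens -= packet_bytes
            return True
        return False

    def record_remote(self, remote_bytes):
        """Gossiped arrivals at the other limiters also drain the bucket."""
        self._refill()
        self.tokens = max(0.0, self.tokens - remote_bytes)
```

As slide 12 shows, this breaks down because the remote drains arrive only as fast as the gossip does.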

Slide 11: A baseline experiment
[Figure: Limiter 1 serving 3 TCP flows and Limiter 2 serving 7 TCP flows, compared against a single token bucket serving all 10 TCP flows]

Slide 12: GTB performance
[Figure: flow rates for the 3-flow, 7-flow, and 10-flow aggregates under a single token bucket versus the global token bucket]
- Problem: GTB requires near-instantaneous arrival information

Slide 13: Take 2: Global Random Drop
- Limiters send their rate information and collect the global rate from the others
- Case 1: global arrival rate (4 Mbps) below the limit (5 Mbps): forward the packet

Slide 14: Global Random Drop (GRD)
- Case 2: global arrival rate (6 Mbps) above the limit (5 Mbps): drop each packet with probability
  excess / global arrival rate = (6 - 5) / 6 = 1/6
- The same drop probability applies at all limiters (sketched below)
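The drop rule fits in a few lines; a sketch (the function name is illustrative):

```python
import random

def grd_should_drop(global_rate_bps, limit_bps):
    """GRD: above the limit, drop with probability excess / global arrival rate.
    Every limiter computes the same probability from the same global estimate."""
    if global_rate_bps <= limit_bps:
        return False                    # Case 1: under the limit, always forward
    excess = global_rate_bps - limit_bps
    return random.random() < excess / global_rate_bps   # Case 2: e.g., 1/6 here

# Slide 14's numbers: 6 Mbps arriving against a 5-Mbps limit
print(grd_should_drop(6e6, 5e6))   # True roughly 1/6 of the time
```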

Slide 15: GRD baseline performance
[Figure: flow rates for the 3-flow, 7-flow, and 10-flow aggregates, single token bucket versus GRD]
- Delivers flow behavior similar to a central limiter

Slide 16: GRD under dynamic arrivals
[Figure: GRD behavior as flows arrive and depart, with a 50-ms estimate interval]

Slide 17: Returning to our baseline
[Figure: Limiter 1 serving 3 TCP flows; Limiter 2 serving 7 TCP flows]

Slide 18: Basic idea: flow counting
- Goal: provide inter-flow fairness for TCP flows
- Each limiter reports its flow count ("3 flows", "7 flows") and enforces its share with local token-bucket enforcement

Slide 19: Estimating TCP demand
- Local token rate (limit) = 10 Mbps
- Flow A = 5 Mbps, Flow B = 5 Mbps
- Flow count = 2 flows

Slide 20: FPS under dynamic arrivals
[Figure: FPS behavior as flows arrive and depart, with a 500-ms estimate interval]

Slide 21: Comparing FPS to GRD
[Figure: GRD (50-ms estimate interval) versus FPS (500-ms estimate interval)]
- Both are responsive and provide similar utilization
- GRD requires accurate estimates of the global rate at all limiters; FPS performs comparably with a 10x longer estimate interval

Slide 22: Estimating skewed demand
[Figure: Limiter 1 serving 1 TCP flow; Limiter 2 serving 3 TCP flows]

Slide 23: Estimating skewed demand
- Local token rate (limit) = 10 Mbps
- Flow A = 8 Mbps; Flow B = 2 Mbps (bottlenecked elsewhere)
- Flow count ≠ demand
- Key insight: use a TCP flow's rate to infer demand

Slide 24: Estimating skewed demand
- Local token rate (limit) = 10 Mbps
- Flow A = 8 Mbps; Flow B = 2 Mbps (bottlenecked elsewhere)
- Effective flow count = local limit / largest flow's rate = 10 / 8 = 1.25 flows (sketched below)
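A sketch of that estimate (hypothetical function name; it relies on the slide's insight that the largest flow is the unbottlenecked one):

```python
def fps_flow_estimate(local_limit_bps, flow_rates_bps):
    """FPS demand estimate: the local limit divided by the largest flow's
    rate gives an effective count of unbottlenecked flows."""
    return local_limit_bps / max(flow_rates_bps)

# Slide 24's numbers: 10-Mbps limit, flows at 8 and 2 Mbps
print(fps_flow_estimate(10e6, [8e6, 2e6]))   # 1.25
```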

Slide 25: FPS example
- Global limit = 10 Mbps; Limiter 1 estimates 1.25 flows, Limiter 2 estimates 3 flows
- Set local token rate = global limit × local flow count / total flow count
  = 10 Mbps × 1.25 / (1.25 + 3) = 2.94 Mbps at Limiter 1 (sketched below)
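The corresponding allocation step, checked against slide 25's numbers (the function name is illustrative):

```python
def fps_local_rate(global_limit_bps, local_flows, all_flow_counts):
    """FPS allocation: each limiter's token rate is its flow-count-weighted
    share of the global limit."""
    return global_limit_bps * local_flows / sum(all_flow_counts)

# Limiter 1: 1.25 of 4.25 total flows under a 10-Mbps global limit
print(fps_local_rate(10e6, 1.25, [1.25, 3]) / 1e6)   # ~2.94 Mbps
```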

Slide 26: FPS bottleneck example
- Initially a 3:7 split between 10 unbottlenecked flows
- At 25 s, the 7-flow aggregate is bottlenecked to 2 Mbps
- At 45 s, an unbottlenecked flow arrives: the remaining 8 Mbps is split 3:1

Slide 27: Real-world constraints
- Resources spent tracking usage are pure overhead
  - Efficient implementation (<3% CPU, using sample-and-hold)
  - Modest communication budget (<1% of the bandwidth)
- The control channel is slow and lossy
  - Need to extend gossip protocols to tolerate loss
  - An interesting research problem in its own right…
- The nodes themselves may fail or partition
  - In an asynchronous system, you cannot tell the difference
  - Need a mechanism that deals gracefully with both

Slide 28: Robust control communication
- 7 limiters enforcing a 10-Mbps limit
- Demand fluctuates every 5 seconds between 1 and 100 flows
- Varying loss on the control channel

Slide 29: Handling partitions
- Failsafe operation: each disconnected group of k limiters enforces k/n of the limit (sketched below)
- Ideally: the Bank-o-mat problem (a credit/debit scheme)
- Challenge: group membership with asymmetric partitions
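A sketch of the k/n failsafe (names are illustrative):

```python
def failsafe_limit(global_limit_bps, group_size, total_limiters):
    """Partition failsafe: a disconnected group of k limiters enforces
    k/n of the global limit, so the groups never exceed it in aggregate."""
    return global_limit_bps * group_size / total_limiters

# A 3-limiter island out of 7 under a 10-Mbps limit
print(failsafe_limit(10e6, 3, 7) / 1e6)   # ~4.29 Mbps
```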

Slide 30: Following PlanetLab demand
- Apache web servers on 10 PlanetLab nodes
  - 5-Mbps aggregate limit
  - Load shifts over time from 10 nodes to 4

Slide 31: Current limiting options
[Figure: demands at 10 Apache servers on PlanetLab; when demand shifts to just 4 nodes, static per-node limits leave wasted capacity]

Slide 32: Applying FPS on PlanetLab
[Figure: the same PlanetLab experiment under FPS]

Slide 33: Hierarchical limiting

Slide 34: A sample use case
- T = 0: A starts 5 flows at L1
- T = 55: A starts 5 flows at L2
- T = 110: B starts 5 flows at L1
- T = 165: B starts 5 flows at L2

Slide 35: Worldwide flow join
- 8 nodes split between UCSD and Polish Telecom
  - 5-Mbps aggregate limit
  - A new flow arrives at each limiter every 10 seconds

Slide 36: Worldwide demand shift
- Same demand-shift experiment as before
  - At 50 seconds, the Polish Telecom demand disappears
  - It reappears at 90 seconds

Slide 37: Where to go from here
- Need to "let go" of full control and make decisions with only a "cloudy" view of actual resource consumption
  - Distinguish between what you know and what you don't know
  - Operate efficiently when you know you know
  - Have failsafe options when you know you don't
- Moreover, we cannot rely on application/service designers to understand their resource demands
  - The system needs to adjust dynamically to shifts
  - We've started to manage the demand side of the equation
  - We're now focusing on the supply side: custom-tailored resource provisioning
