Introduction: Distributed computing today: software as a service (Google Documents, Groove Office, Windows Live). Benefit for users: easier management. Benefit for service providers: leverage widely distributed computing infrastructures.
Introduction: Barrier: loss of cost control (how to bill?). Amazon’s EC2 uses metered pricing, but customers prefer a flat fee. Flat fee => provider must be able to limit consumption to control costs, which is difficult to do in a distributed environment. Focus: control aggregate network bandwidth with distributed rate limiting (DRL).
Introduction: Goal: allow a set of distributed traffic rate limiters to collaborate to subject a class of network traffic (e.g. one service) to a single, aggregate global limit. Example: a resource provider with 10 hosting centers and a limit of 100 Mbps. Current options: (1) 100 Mbps at each hosting center (all might use this limit simultaneously => 1 Gbps); (2) 10 Mbps at each center (efficient use unlikely unless traffic is perfectly balanced).
Introduction: Key challenge: flows arriving at different limiters should achieve the same rates as if they were all traversing a single shared rate limiter. We present the illusion of passing all traffic through a single token-bucket rate limiter. Key challenge: measuring the demand of the aggregate at each limiter and apportioning capacity in proportion to that demand.
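The arithmetic behind the two static options on this slide can be checked with a short sketch. The helper names and the skewed demand vector are hypothetical and chosen only for illustration; the 10 centers and the 100 Mbps limit come from the slide.

```python
# Numbers from the slide: 10 hosting centers, 100 Mbps global limit.
GLOBAL_LIMIT_MBPS = 100
NUM_CENTERS = 10


def worst_case_aggregate(per_site_limit_mbps, num_sites):
    """Peak aggregate rate if every site sends at its local limit at once."""
    return per_site_limit_mbps * num_sites


def static_split_throughput(demands_mbps, per_site_limit_mbps):
    """Aggregate throughput when each site is capped independently."""
    return sum(min(d, per_site_limit_mbps) for d in demands_mbps)


# Option 1: every center gets the full limit, so the worst case is 10x over.
full_limit_peak = worst_case_aggregate(GLOBAL_LIMIT_MBPS, NUM_CENTERS)

# Option 2: split the limit evenly. With an assumed skewed demand
# (two busy centers, eight idle ones) most of the limit goes unused.
skewed_demand = [55, 45] + [0] * 8
even_share = GLOBAL_LIMIT_MBPS / NUM_CENTERS
even_split_used = static_split_throughput(skewed_demand, even_share)
```

Under these assumed demands, the even split serves only a fraction of the 100 Mbps that the customer is entitled to, which is the inefficiency the slide points at.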
Classes of Clouds: Limiting cloud-based services. Cloud-based services: clients see a unified service and are unaware of the independent physical sites behind it. DRL gives providers the ability to control network bandwidth as if it were sourced from a single site => no migration necessary; bandwidth gravitates towards the sites with the most demand.
Classes of Clouds: Content distribution networks. CDNs replicate the content of third-party web sites at numerous geographically diverse locations to improve performance, scalability, and reliability. With DRL, CDNs can set per-customer limits based on service-level agreements. DRL also acts as a protective mechanism to rate limit nefarious users.
Classes of Clouds: Internet testbeds. PlanetLab currently has bandwidth limits at each individual site, but cannot enforce a limit across multiple machines. DRL provides effective limits for a PlanetLab service distributed across North America.
Classes of Clouds: Assumptions and scope. No QoS guarantees. Traffic belonging to a particular service can be identified. Discussion is in terms of a single service, without loss of generality.
Limiter Design: Peer-to-peer limiter architecture. Tasks: estimation, communication, allocation. Each limiter periodically measures the traffic arrival rate, communicates it to the other limiters, receives rates from the other limiters, computes an estimate of the global rate, and determines how to service local demand to enforce the global rate.
Limiter Design: Estimation: compute the average arrival rate over fixed time intervals and use an exponentially-weighted moving average (EWMA) filter to smooth out short-term fluctuations (settings determined later). At the end of each estimate interval, local changes are merged with the global estimate, and each limiter disseminates its local changes to the other limiters using a gossip protocol over UDP.
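The estimation step can be sketched as follows. This is a minimal illustration, not the presenters' implementation: the class name, the `alpha` parameter (weight given to the newest sample), and the choice to seed the average with the first sample are all assumptions, since the slide leaves the filter settings for later.

```python
class EwmaRateEstimator:
    """Per-interval arrival-rate estimate smoothed with an EWMA (sketch)."""

    def __init__(self, interval_s, alpha):
        self.interval_s = interval_s  # length of one estimate interval
        self.alpha = alpha            # assumed: weight of the newest sample
        self.rate_bps = None          # smoothed rate, None until first interval
        self._bytes = 0               # bytes seen in the current interval

    def on_packet(self, size_bytes):
        """Count an arriving packet toward the current interval."""
        self._bytes += size_bytes

    def end_interval(self):
        """Close the interval and fold its average rate into the EWMA."""
        sample_bps = self._bytes * 8 / self.interval_s
        self._bytes = 0
        if self.rate_bps is None:
            self.rate_bps = sample_bps  # assumed: seed with the first sample
        else:
            self.rate_bps = (self.alpha * sample_bps
                             + (1 - self.alpha) * self.rate_bps)
        return self.rate_bps
```

The value returned by `end_interval()` is what a limiter would hand to its gossip layer at the end of each estimate interval; the gossip exchange itself is outside this sketch.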
Limiter Design: Allocation: global token bucket (GTB), global random drop (GRD), flow proportional share (FPS).
Token Bucket: Common trick used to control the amount of data injected into a network while allowing bursts. A bucket holds a limited number of tokens. Tokens are added to the bucket at some rate. If a token arrives when the bucket is full, it is discarded. When a packet arrives, some number of tokens is removed and the packet is sent to the network. If a packet arrives when the bucket is empty => dropped.
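The token bucket described on this slide can be sketched in a few lines. The class and parameter names are hypothetical, and the sketch assumes that the bucket starts full, that a packet costs a caller-chosen number of tokens (for example one token per byte), and that the caller passes the current time explicitly.

```python
class TokenBucket:
    """Single token-bucket rate limiter (illustrative sketch)."""

    def __init__(self, rate_tokens_per_s, capacity):
        self.rate = rate_tokens_per_s  # tokens added per second
        self.capacity = capacity       # maximum tokens the bucket can hold
        self.tokens = capacity         # assumed: bucket starts full
        self.last = 0.0                # time of the last refill

    def _refill(self, now):
        """Add tokens for the elapsed time; overflow beyond capacity is discarded."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

    def allow(self, packet_tokens, now):
        """Return True and remove tokens if the packet may be sent, else False (drop)."""
        self._refill(now)
        if self.tokens >= packet_tokens:
            self.tokens -= packet_tokens
            return True
        return False
```

The capacity bounds the largest burst that can pass at once, while the refill rate bounds the long-term average rate.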
Limiter Design: Global token bucket. Emulates a centralized token bucket. Each limiter’s token bucket refreshes at the global rate. At every interval, each limiter computes and sends its local rate, obtains the local rates of the other limiters, sums them, and removes tokens at this global rate. Highly sensitive to stale observations; impractical at large scale or in lossy networks.
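One way to read the global token bucket slide is sketched below. It is an interpretation under stated assumptions, not the presenters' code: the class name, byte-denominated tokens, and the order of refill and drain within an interval are all hypothetical. Local packets drain the bucket directly in `allow`, and traffic reported by the other limiters drains it once per interval.

```python
class GlobalTokenBucket:
    """One limiter's local copy of the shared bucket (illustrative sketch)."""

    def __init__(self, global_limit_bytes_per_s, capacity_bytes):
        self.limit = global_limit_bytes_per_s  # refill at the GLOBAL rate
        self.capacity = capacity_bytes
        self.tokens = capacity_bytes           # assumed: bucket starts full

    def on_interval(self, interval_s, remote_rates_bytes_per_s):
        """Refill at the global limit, then charge for traffic seen elsewhere."""
        refill = self.limit * interval_s
        # Rates reported by the other limiters for the interval just ended.
        remote_bytes = sum(remote_rates_bytes_per_s) * interval_s
        self.tokens = min(self.capacity, self.tokens + refill)
        self.tokens = max(0.0, self.tokens - remote_bytes)

    def allow(self, size_bytes):
        """Local packets consume tokens exactly as in a single token bucket."""
        if self.tokens >= size_bytes:
            self.tokens -= size_bytes
            return True
        return False
```

Because remote traffic is charged from reported rates, a late or lost update leaves the local bucket too full or too empty, which is consistent with the slide's remark about sensitivity to stale observations.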
Limiter Design: Global random drop. Instead of emulating a central limiter, emulate the drop rate of the centralized case. As before, collect demand from the other limiters, then compute a drop probability proportional to (demand - limit). Performs better over longer periods of time; does not capture short-term effects.
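The drop decision can be sketched as below. The slide only says that the probability is proportional to (demand - limit); dividing the excess by the global demand, so that the result lies between 0 and 1, is an assumption of this sketch, as are the function names and the injectable `rng` parameter.

```python
import random


def drop_probability(global_demand, global_limit):
    """Fraction of traffic to drop so that the remainder fits the limit.

    Assumed normalization: excess demand divided by total demand.
    """
    if global_demand <= 0 or global_demand <= global_limit:
        return 0.0
    return (global_demand - global_limit) / global_demand


def should_drop(local_demand, remote_demands, global_limit, rng=random.random):
    """Decide the fate of one packet from local and gossiped demand estimates."""
    global_demand = local_demand + sum(remote_demands)
    return rng() < drop_probability(global_demand, global_limit)
```

Every limiter applies the same probability to its own packets, so each contributes to the aggregate in proportion to its demand without keeping any token state.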
Evaluation Methodology: 3 metrics: utilization, flow fairness, responsiveness. Basic goal: hold aggregate throughput across all limiters below the global limit. Achieve fairness equal to or better than that of a centralized token bucket limiter.
Evaluation Methodology: Evaluation on an emulation testbed with ModelNet. Simple mesh topology to connect the limiters. Each source and sink pair is routed through a single limiter. 100 Mbps links.
Evaluation: Flow dynamics: FPS only requires updates as flows arrive, depart, or change their behavior. Baseline: loaded the limiters with 10 unbottlenecked TCP flows, using a 3-7 skew. The aggregate was apportioned between the limiters in about a 3-7 split.
Evaluation: Mixed TCP flow round-trip times: FPS provides a higher degree of fairness between RTTs. Traffic distributions: evaluated the effects of varying traffic demands. Bottlenecked TCP flows: evaluated the ability of FPS to correctly allocate rate across aggregates of bottlenecked and unbottlenecked flows.
Conclusion: Demands on traditional Web-hosting providers and ISPs are likely to shift. Our experiments show that naïve implementations are unable to deliver adequate levels of fairness. Our results demonstrate that it’s possible to recreate the flow behavior that end users expect from a centralized rate limiter.