
1 Dynamic Replica Placement for Scalable Content Delivery Yan Chen, Randy H. Katz, John D. Kubiatowicz {yanchen, randy, kubitron}@CS.Berkeley.EDU EECS Department UC Berkeley

2 Motivation Scenario [Figure: data plane over network plane, showing the data source, Web content server, CDN servers, clients, and replicas]

3 Goal and Challenges Goal: provide content distribution to clients with good Quality of Service (QoS) while retaining efficient and balanced resource consumption of the underlying infrastructure Dynamically choose the number and placement of replicas while satisfying clients’ QoS and servers’ capacity constraints Good performance for update dissemination –Delay –Bandwidth consumption Without global network topology knowledge Scalability for millions of objects, clients and servers

4 Outline Goal and Challenges Previous Work Our Solutions: Dissemination Tree Peer-to-peer Location Service Replica Placement and Tree Construction Evaluation Conclusion and Future Work

5 Previous Work Focused on static replica placement –Clients’ distributions and access patterns known in advance –Assumes global IP network topology DNS-redirection based CDNs highly inefficient –Centralized CDN name server cannot record replica locations No inter-domain IP multicast Application-level multicast (ALM) unscalable –Root maintains state for all children or handles all “join” requests

6 Solution: Dissemination Tree Dynamic replica placement: use a close-to-minimum number of replicas to satisfy QoS and capacity constraints with only local network topology knowledge Adaptive cache coherence for efficient coherence notification Use the Tapestry location service to improve scalability and locality [Figure: data plane over the Tapestry mesh on the network plane; data source, root server, replicas, and clients; replicas are always updated, caches use adaptive coherence]

7 Peer-to-peer Routing and Location Services Properties needed by d-tree –Distributed, scalable location with guaranteed success –Search with locality P2P routing and location services: –CAN, Chord, Pastry, Tapestry, etc. http://www.cs.berkeley.edu/~ravenben/tapestry

8 Integrated Replica Placement and d-tree Construction Dynamic Replica Placement + Application-level Multicast –Naïve approach –Smart approach Static Replica Placement + IP Multicast –Modeled as a capacitated facility location problem –Design a greedy algorithm with log N approximation –Optimal case for comparison Soft-state Tree Maintenance

9 Dynamic Replica Placement: naïve [Figure: client c reaches server s through its surrogate along the Tapestry overlay path; parent candidates shown on the data and network planes]

10 Dynamic Replica Placement: naïve [Figure: as before, with the first placement choice highlighted among the parent candidates on the Tapestry overlay path]

11 Dynamic Replica Placement: smart Aggressive search Greedy load distribution [Figure: parent candidates around server s (its parent, siblings, server children, and client children) on the Tapestry overlay path from client c through the surrogate]

12 Dynamic Replica Placement: smart Aggressive search Lazy placement Greedy load distribution [Figure: as before, with the first placement choice highlighted among the parent candidates]

13 Evaluation Methodology Network Topology –5000-node network with GT-ITM transit-stub model –500 d-tree server nodes, 4500 clients join in random order Network Simulator –NS-like packet-level priority-queue based event simulator Dissemination Tree Server Deployment –Random d-tree –Backbone d-tree (choose backbone routers and subnet gateways first) Constraints –50 ms latency bound and 200 clients/server load bound

14 Performance of Dynamic Replica Placement Compare Four Approaches –Overlay dynamic naïve placement (dynamic_naïve) –Overlay dynamic smart placement (dynamic_smart) –Static placement on overlay network (overlay_static) –Static placement on IP network (IP_static) Metrics –Number of replicas deployed and load distribution –Dissemination multicast performance –Tree construction traffic

15 Number of Replicas Deployed and Load Distribution Overlay_smart uses far fewer replicas than overlay_naïve and comes very close to IP_static Overlay_smart has better load distribution than overlay_naïve and overlay_static, and very close to IP_static

16 Multicast Performance For overlay_smart, 85% of clients have a Relative Delay Penalty (RDP) below 4 Bandwidth consumed by overlay_smart is very close to IP_static and much less than overlay_naïve
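For reference, RDP follows its standard definition (notation ours, not the slide's): RDP(c) = d_tree(source, c) / d_unicast(source, c), i.e., the ratio of the delay a client sees through the dissemination tree to the direct unicast delay from the source. RDP = 1 would mean the overlay adds no delay.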

17 Tree Construction Traffic Includes “join” requests, “ping” messages, replica placement, and parent/child registration Overlay_smart consumes three to four times as much traffic as overlay_naïve, and the traffic of overlay_naïve is quite close to IP_static Tree construction is a far less frequent event than update dissemination

18 Conclusions and Future Work Dissemination Tree: a dynamic Content Distribution Network on top of a peer-to-peer location service –Dynamic replica placement satisfies QoS and capacity constraints and self-organizes into an application-level multicast tree –Uses Tapestry to improve scalability and locality Simulation results show –Close-to-optimal number of replicas, good load distribution, low multicast delay and bandwidth penalty, at the price of reasonable construction traffic Future work –Evaluate with more diverse topologies and workloads –Dynamic replica deletion/migration to adapt to shifts in users’ interests

19 Routing in Detail [Figure: Tapestry neighbor map for node 5712 (octal digits, 2^12 namespace, four routing levels) and an example route from 5712 to 7510 that resolves one more matching digit per hop (…0, …10, …510, 7510); other nodes shown: 0880, 3210, 4510]
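As a sketch of how such a neighbor map is used, the snippet below routes by resolving one more low-order digit of the destination ID per hop. The neighbor maps are invented to illustrate one plausible route for the slide's example, and surrogate routing for missing entries is omitted.

```python
# Suffix-digit routing sketch (octal, 4-digit IDs). Each hop forwards to
# a neighbor sharing one more trailing digit with the destination. The
# neighbor maps below are made up for illustration; real Tapestry fills
# every (level, digit) slot it can reach.

def shared_suffix_len(a: str, b: str) -> int:
    """Count trailing digits two node IDs have in common."""
    n = 0
    while n < len(a) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def next_hop(current: str, dest: str, neighbor_map: dict) -> str:
    level = shared_suffix_len(current, dest)   # digits already resolved
    digit = dest[-1 - level]                   # next digit to match
    return neighbor_map[(level, digit)]

# Per-node neighbor maps: (level, next-digit) -> neighbor.
maps = {
    "5712": {(0, "0"): "0880"},   # match suffix ...0
    "0880": {(1, "1"): "3210"},   # match suffix ...10
    "3210": {(2, "5"): "4510"},   # match suffix ...510
    "4510": {(3, "7"): "7510"},   # match full ID 7510
}

node = "5712"
while node != "7510":
    node = next_hop(node, "7510", maps[node])
    print(node)   # 0880, 3210, 4510, 7510
```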

20 Dynamic Replica Placement Client c sends a “join” request for object o to the statistically closest server s, through the nearest representative server rs Naïve placement only checks s before placing new replicas, while the smart algorithm additionally checks the parent, free siblings, and free server children of s –Remaining capacity info is piggybacked in soft-state messages If none qualifies, s places a new replica on one of the Tapestry path nodes from c to s (the path is piggybacked in the request) The naïve algorithm always chooses the qualified node closest to c, while the smart one places the replica on the furthest qualified node
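A minimal sketch of this decision logic, assuming illustrative node fields (parent, free_siblings, free_server_children, load, latency_to) and the evaluation's 50 ms / 200-client constraints; none of these names come from the paper.

```python
# Naïve vs. smart replica placement sketch. A node "qualifies" if it can
# serve the client within the latency bound and below the load bound.

LATENCY_BOUND_MS, LOAD_BOUND = 50, 200   # constraints from the evaluation

def qualified(node, client):
    return node.latency_to(client) <= LATENCY_BOUND_MS and node.load < LOAD_BOUND

def place(client, s, tapestry_path, smart=True):
    # Smart also checks s's parent, free siblings, and free server children
    # (their remaining capacity is piggybacked in soft-state messages);
    # naïve checks s alone.
    candidates = [s]
    if smart:
        candidates += [s.parent, *s.free_siblings, *s.free_server_children]
    for n in candidates:
        if n is not None and qualified(n, client):
            return n                        # an existing node serves c
    # Otherwise place a new replica on a Tapestry path node; the path was
    # piggybacked in the join request and is ordered from c toward s.
    # Naïve takes the qualified node closest to c, smart the furthest one
    # (lazy placement).
    ok = [n for n in tapestry_path if qualified(n, client)]
    return ok[-1] if smart else ok[0]
```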

21 Static Replica Placement Without capacity constraints: minimal set cover –Most cost-effective method: greedy algorithm [Grossman & Wool 1994] With capacity constraints: variant of the capacitated facility location problem –C demand locations, S locations to build facilities; the capacity installed at each location is an integer multiple of u –Facility building cost f_i, service cost c_ij –Objective function: minimize the sum of f_i and c_ij –Mapped to our problem: f_i = 1, and c_ij = 0 if location i covers demand j, infinity otherwise
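In the standard integer-program form (conventional notation, assumed here: y_i = number of capacity units opened at location i, x_ij = 1 if demand j is assigned to location i):

minimize Σ_{i∈S} f_i·y_i + Σ_{i∈S} Σ_{j∈C} c_ij·x_ij
subject to Σ_{i∈S} x_ij = 1 for all j∈C; Σ_{j∈C} x_ij ≤ u·y_i for all i∈S; x_ij ∈ {0,1}, y_i ∈ {0,1,2,…}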

22 Static Replica Placement Solutions Best theoretical one: primal-dual schema with Lagrangian relaxation [Jain & Vazirani 1999] –Approximation ratio 4 Variant of greedy algorithm –Approximation ratio ln|S| –Choose the s with the largest value of min(cardinality |C_s|, remaining capacity RC_s) –If RC_s < |C_s|, choose the least-covered clients to cover first With global IP topology vs. with Tapestry overlay path topology only
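A sketch of that greedy variant under assumed inputs: cover[s] is the set of clients server s can cover within the QoS bound, cap[s] its capacity (names are ours, not the paper's).

```python
# Greedy placement sketch (ln|S| approximation). Repeatedly open the
# server with the largest min(|C_s|, RC_s): C_s = still-uncovered clients
# s can reach within the QoS bound, RC_s = its remaining capacity.

def greedy_place(clients, servers, cover, cap):
    uncovered = set(clients)
    placement = {}                       # opened server -> assigned clients

    def gain(s):
        return min(len(cover[s] & uncovered), cap[s])

    while uncovered:
        unopened = [s for s in servers if s not in placement]
        if not unopened:
            break
        s = max(unopened, key=gain)
        if gain(s) == 0:
            break                        # no remaining server helps
        reachable = cover[s] & uncovered
        if cap[s] < len(reachable):
            # RC_s < |C_s|: cover the least-covered clients first (those
            # reachable from the fewest still-unopened servers).
            def options(c):
                return sum(1 for t in unopened if c in cover[t])
            chosen = set(sorted(reachable, key=options)[:cap[s]])
        else:
            chosen = reachable
        placement[s] = chosen
        uncovered -= chosen
    return placement
```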

23 Soft-State Tree Maintenance Bi-directional messaging –Heartbeat messages downstream, refresh messages upstream Scalability –Each member only maintains state for its direct children and parent –A “join” request can be handled by any member Fault tolerance through “rejoining”
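A minimal sketch of the bi-directional soft-state protocol; the intervals, timeouts, and method names are illustrative, not from the paper.

```python
import time

HEARTBEAT_INTERVAL = 5   # seconds, downstream; illustrative values
REFRESH_TIMEOUT = 15     # child state expires if not refreshed in time

class TreeMember:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = {}               # child -> last refresh time
        self.last_parent_heartbeat = None

    def tick(self, now=None):
        """Called every HEARTBEAT_INTERVAL: heartbeat down, refresh up."""
        now = time.time() if now is None else now
        for child, last in list(self.children.items()):
            if now - last > REFRESH_TIMEOUT:
                del self.children[child]         # soft state silently expires
            else:
                child.on_heartbeat(self, now)    # heartbeat downstream
        if self.parent is not None:
            self.parent.on_refresh(self, now)    # refresh upstream

    def on_refresh(self, child, now):
        self.children[child] = now               # renew child's soft state

    def on_heartbeat(self, parent, now):
        self.last_parent_heartbeat = now

    def rejoin(self):
        # Fault tolerance: if heartbeats stop, drop the dead parent and
        # re-send a "join" request; any member can handle it, so no
        # central recovery is needed.
        self.parent = None
```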

24 Performance Under Various Conditions With 100, 1000 or 4500 clients With or without load constraints Overlay_smart performs consistently better than overlay_naïve and stays close to IP_static, as before The load constraint helps avoid hot spots

25 Performance Under Various Conditions II With 2500 random d-tree servers Merit of scalability: the maximal number of server children is very small compared with the total number of servers –Due to the randomized and “search with locality” properties of Tapestry

26 [Backup figure legend: data plane and network plane; data source, root server, servers, replicas, caches, clients; client c, server s, surrogate; parent, sibling, server child, client child; Tapestry mesh and overlay path; parent candidates and first placement choice]

