Dynamic Replica Placement for Scalable Content Delivery. Yan Chen, Randy H. Katz, John D. Kubiatowicz, EECS Department.


Dynamic Replica Placement for Scalable Content Delivery
Yan Chen, Randy H. Katz, John D. Kubiatowicz
EECS Department, UC Berkeley

Motivation Scenario
[Figure: data plane and network plane, showing the data source, Web content server, CDN servers, clients and replicas]

Goal and Challenges
Goal: provide content distribution to clients with good Quality of Service (QoS) while retaining efficient and balanced resource consumption of the underlying infrastructure
Challenges:
Dynamically choose the number and placement of replicas while satisfying clients’ QoS and servers’ capacity constraints
Good performance for update dissemination
–Delay
–Bandwidth consumption
Without global network topology knowledge
Scalability for millions of objects, clients and servers

Outline
Goal and Challenges
Previous Work
Our Solutions: Dissemination Tree
Peer-to-peer Location Service
Replica Placement and Tree Construction
Evaluation
Conclusion and Future Work

Previous Work
Focused on static replica placement
–Clients’ distributions and access patterns known in advance
–Assume global IP network topology
DNS-redirection based CDNs are highly inefficient
–Centralized CDN name server cannot record replica locations
No inter-domain IP multicast
Application-level multicast (ALM) is unscalable
–Root maintains state for all children or handles all “join” requests

Solution: Dissemination Tree
Dynamic replica placement: use a close-to-minimum number of replicas to satisfy QoS and capacity constraints, using local network topology knowledge
Adaptive cache coherence for efficient coherence notification
Use Tapestry location service to improve scalability and locality
[Figure: data plane and network plane over the Tapestry mesh, showing the data source, root server, servers, clients, replicas (always updated) and caches (adaptive coherence)]

Peer-to-peer Routing and Location Services
Properties needed by d-tree
–Distributed, scalable location with guaranteed success
–Search with locality
P2P routing and location services
–CAN, Chord, Pastry, Tapestry, etc.

Integrated Replica Placement and d-tree Construction
Dynamic Replica Placement + Application-level Multicast
–Naïve approach
–Smart approach
Static Replica Placement + IP Multicast
–Modeled as capacitated facility location problem
–Design a greedy algorithm with log N approximation
–Optimal case for comparison
Soft-state Tree Maintenance

Dynamic Replica Placement: naïve
[Figure: client c and server s on the data plane and network plane; labels: surrogate, parent candidate, Tapestry overlay path]

Dynamic Replica Placement: naïve
[Figure: client c and server s on the data plane and network plane; labels: surrogate, parent candidate, Tapestry overlay path, first placement choice]

Dynamic Replica Placement: smart
Aggressive search
Greedy load distribution
[Figure: client c and server s on the data plane and network plane; labels: parent, sibling, server child, client child, surrogate, parent candidates, Tapestry overlay path]

Dynamic Replica Placement: smart
Aggressive search
Lazy placement
Greedy load distribution
[Figure: client c and server s on the data plane and network plane; labels: parent, sibling, server child, client child, surrogate, parent candidates, Tapestry overlay path, first placement choice]

Evaluation Methodology
Network Topology
–5000-node network with GT-ITM transit-stub model
–500 d-tree server nodes, 4500 clients join in random order
Network Simulator
–NS-like packet-level priority-queue based event simulator
Dissemination Tree Server Deployment
–Random d-tree
–Backbone d-tree (choose backbone routers and subnet gateways first)
Constraints
–50 ms latency bound and 200 clients/server load bound

Performance of Dynamic Replica Placement
Compare Four Approaches
–Overlay dynamic naïve placement (dynamic_naïve)
–Overlay dynamic smart placement (dynamic_smart)
–Static placement on overlay network (overlay_static)
–Static placement on IP network (IP_static)
Metrics
–Number of replicas deployed and load distribution
–Dissemination multicast performance
–Tree construction traffic

Number of Replicas Deployed and Load Distribution
Overlay_smart uses far fewer replicas than overlay_naïve and is very close to IP_static
Overlay_smart has better load distribution than od_naïve and overlay_static, and is very close to IP_static

Multicast Performance
85% of the Relative Delay Penalty (RDP) values for overlay_smart are less than 4
Bandwidth consumed by overlay_smart is very close to IP_static and much less than overlay_naive
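The slide does not define RDP. The sketch below assumes the usual definition from the application-level multicast literature: the ratio of the delay along the overlay path to the delay of the direct IP unicast path. The function names and the sample delays are made up for illustration and are not taken from the evaluation.

```python
def relative_delay_penalty(overlay_delay_ms, unicast_delay_ms):
    """RDP under the assumed definition: overlay delay / direct unicast delay."""
    if unicast_delay_ms <= 0:
        raise ValueError("unicast delay must be positive")
    return overlay_delay_ms / unicast_delay_ms


def fraction_below(values, threshold):
    """Share of values strictly below the threshold."""
    return sum(1 for v in values if v < threshold) / len(values)


# Hypothetical (overlay delay, unicast delay) pairs in milliseconds.
samples = [(30.0, 20.0), (90.0, 30.0), (100.0, 20.0), (45.0, 45.0)]
rdps = [relative_delay_penalty(o, u) for o, u in samples]
share_below_4 = fraction_below(rdps, 4.0)
print(rdps, share_below_4)
```

A statement such as "85% of RDP values are less than 4" corresponds to `fraction_below(rdps, 4.0)` returning 0.85 over all measured client paths.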

Tree Construction Traffic
Includes “join” requests, “ping” messages, replica placement and parent/child registration
Overlay_smart consumes three to four times as much traffic as overlay_naïve, and the traffic of overlay_naïve is quite close to IP_static
Tree construction is a far less frequent event than update dissemination

Conclusions and Future Work
Dissemination Tree: dynamic Content Distribution Network on top of a peer-to-peer location service
–Dynamic replica placement satisfies QoS and capacity constraints and self-organizes into an app-level multicast tree
–Use Tapestry to improve scalability and locality
Simulation Results Show
–Close to optimal number of replicas, good load distribution, low multicast delay and bandwidth penalty at the price of reasonable construction traffic
Future Work
–Evaluate with more diverse topologies and workloads
–Dynamic replica deletion/migration to adapt to shifts in users’ interests

Routing in Detail
Neighbor Map for “5712” (Octal), by routing level
Level 1: xxx0 xxx1 xxx3 xxx4 xxx5 xxx6 xxx7
Level 2: xx02 xx22 xx32 xx42 xx52 xx62 xx72
Level 3: x012 x112 x212 x312 x412 x512 x612
Level 4: [entries missing from the transcript]
Example: octal digits, 2^12 namespace, 5712 → 7510
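The neighbor-map patterns on this slide (xxx0, xx22, x012) suggest that an ID is resolved one digit at a time starting from the rightmost digit. The sketch below works under that assumption and computes only which pattern each hop must match. It does not model neighbor tables or surrogate routing, and the function names are hypothetical.

```python
def shared_suffix_len(a, b):
    """Number of trailing digits two equal-length IDs have in common."""
    n = 0
    while n < len(a) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def hop_patterns(dest):
    """Pattern that the node reached after hop k must match, for k = 1..len(dest)."""
    return ["x" * (len(dest) - k) + dest[-k:] for k in range(1, len(dest) + 1)]


def next_hop_pattern(current, dest):
    """Pattern of the next hop from `current`, or None if already at `dest`."""
    k = shared_suffix_len(current, dest)
    if k == len(dest):
        return None
    return "x" * (len(dest) - k - 1) + dest[-(k + 1):]


print(hop_patterns("7510"))              # patterns for a route toward 7510
print(next_hop_pattern("5712", "7510"))  # first hop from 5712
```

With four octal digits, a route needs at most four hops, one per routing level.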

Dynamic Replica Placement
Client c sends a “join” request to the statistically closest server s holding object o, through the nearest representative server rs
Naïve placement only checks s before placing new replicas, while the smart algorithm additionally checks the parent, free siblings and free server children of s
–Remaining capacity info piggybacked in soft-state messages
If unsatisfied, s places a new replica on one of the Tapestry path nodes from c to s (path piggybacked in the request)
The naïve algorithm always chooses the qualified node closest to c, while the smart algorithm places the replica on the furthest qualified node
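The difference between the two placement rules can be sketched as follows. The `PathNode` structure, the `qualified` test (latency to the client within the bound and some capacity left) and the sample path are assumptions made for illustration. The slide says only that a "qualified" node on the Tapestry path is chosen.

```python
from typing import NamedTuple, Optional, Sequence


class PathNode(NamedTuple):
    """Hypothetical record for one node on the overlay path from c to s."""
    name: str
    latency_to_client_ms: float
    remaining_capacity: int


def qualified(node: PathNode, latency_bound_ms: float) -> bool:
    # Assumed test: the node can serve c within the bound and has room left.
    return node.latency_to_client_ms <= latency_bound_ms and node.remaining_capacity > 0


def place_replica(path: Sequence[PathNode], latency_bound_ms: float,
                  smart: bool) -> Optional[PathNode]:
    """`path` is ordered from the client c toward the server s."""
    candidates = [n for n in path if qualified(n, latency_bound_ms)]
    if not candidates:
        return None
    # Naive: closest qualified node to c. Smart: furthest qualified node from c.
    return candidates[-1] if smart else candidates[0]


path = [
    PathNode("A", 5.0, 10),
    PathNode("B", 20.0, 0),   # no capacity left
    PathNode("C", 40.0, 3),
    PathNode("D", 70.0, 8),   # beyond the latency bound
]
naive_choice = place_replica(path, 50.0, smart=False)
smart_choice = place_replica(path, 50.0, smart=True)
print(naive_choice.name, smart_choice.name)
```

Placing the replica further from c, as the smart rule does, puts it where later clients joining through the same part of the path are more likely to be within its reach.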

Static Replica Placement
Without capacity constraints: minimal set cover
–Most cost-effective method: greedy algorithm [Grossman & Wool 1994]
With capacity constraints: variant of the capacitated facility location problem
–C demand locations, S locations to build facilities; the capacity installed at each location is an integer multiple of u
–Facility building cost f_i, service cost c_ij
–Objective function: minimize the sum of f_i and c_ij
–Mapped to our problem: f_i = 1, and c_ij = 0 if location i covers demand j and infinity otherwise
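One standard way to write this objective is as an integer program. The decision variables y_i (number of capacity units of size u opened at location i) and x_ij (1 if demand j is served from location i) are notation introduced here, and C and S are treated as sets. The slide itself gives only the costs f_i and c_ij.

```latex
\begin{aligned}
\min \quad & \sum_{i \in S} f_i\, y_i \;+\; \sum_{i \in S} \sum_{j \in C} c_{ij}\, x_{ij} \\
\text{s.t.} \quad & \sum_{i \in S} x_{ij} = 1 \quad \forall j \in C, \\
& \sum_{j \in C} x_{ij} \le u\, y_i \quad \forall i \in S, \\
& x_{ij} \in \{0, 1\}, \quad y_i \in \{0, 1, 2, \dots\}.
\end{aligned}
```

With f_i = 1 and c_ij equal to 0 or infinity, any finite-cost solution assigns each client only to a location that covers it, and the objective reduces to the sum of the y_i, that is, the number of replica capacity units deployed.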

Static Replica Placement Solutions
Best theoretical one: use primal-dual schema and Lagrangian relaxation [Jain & Vazirani 1999]
–Approximation ratio 4
Variant of greedy algorithm
–Approximation ratio ln|S|
–Choose s with the largest value of min(cardinality |C_s|, remaining capacity RC_s)
–If RC_s < |C_s|, choose the least-covered clients to cover first
With global IP topology vs. with Tapestry overlay path topology only
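The greedy variant can be sketched as below. Several details are assumptions rather than the authors' algorithm: C_s is taken to be the set of still-uncovered clients that s can cover, "least-covered" is read as "coverable by the fewest servers that still have capacity", ties are broken by name, and each server has a single fixed capacity. The input dictionaries are made up for illustration.

```python
def greedy_placement(coverage, capacity):
    """coverage: server -> set of clients it can cover; capacity: server -> int.

    Returns server -> list of clients assigned to it.
    """
    uncovered = set().union(*coverage.values())
    remaining = dict(capacity)
    assignment = {}

    def gain(server):
        # min(|C_s|, RC_s) from the slide, restricted to uncovered clients.
        return min(len(coverage[server] & uncovered), remaining[server])

    def degree(client):
        # Number of servers with spare capacity that could still cover the client.
        return sum(1 for s in coverage if client in coverage[s] and remaining[s] > 0)

    while uncovered:
        best = max(sorted(coverage), key=gain)
        if gain(best) == 0:
            break  # remaining clients cannot be covered by any server
        candidates = coverage[best] & uncovered
        # If capacity is short, the least-covered clients are served first.
        chosen = sorted(candidates, key=lambda c: (degree(c), c))[:remaining[best]]
        assignment.setdefault(best, []).extend(chosen)
        remaining[best] -= len(chosen)
        uncovered -= set(chosen)
    return assignment


coverage = {"s1": {"a", "b", "c"}, "s2": {"c", "d"}, "s3": {"d"}}
capacity = {"s1": 2, "s2": 2, "s3": 1}
result = greedy_placement(coverage, capacity)
print(result)
```

In this example s1 can cover three clients but has capacity for two, so it takes a and b, which no other server can reach, and leaves c to s2.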

Soft State Tree Maintenance
Bi-directional Messaging
–Heartbeat message downstream & refresh message upstream
Scalability
–Each member only maintains state for its direct children and parent
–“Join” request can be handled by any member
Fault-tolerance through “rejoining”
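A minimal sketch of the per-member state this implies, assuming a single timeout after which silent children are dropped and a silent parent triggers a rejoin. The class name, method names and timeout value are hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
class SoftStateMember:
    """Tracks only the direct parent and direct children, as on the slide."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.parent_last_heartbeat = None   # time of last heartbeat from parent
        self.children = {}                  # child id -> time of last refresh

    def on_heartbeat(self, now):
        """Heartbeat received from the parent (downstream message)."""
        self.parent_last_heartbeat = now

    def on_refresh(self, child, now):
        """Refresh received from a child (upstream message)."""
        self.children[child] = now

    def expire_children(self, now):
        """Drop children whose refresh is older than the timeout."""
        dead = [c for c, t in self.children.items() if now - t > self.timeout_s]
        for c in dead:
            del self.children[c]
        return dead

    def needs_rejoin(self, now):
        """True if the parent has been silent for longer than the timeout."""
        return (self.parent_last_heartbeat is None
                or now - self.parent_last_heartbeat > self.timeout_s)


member = SoftStateMember(timeout_s=3.0)
member.on_heartbeat(0.0)
member.on_refresh("a", 0.0)
member.on_refresh("b", 2.0)
expired = member.expire_children(4.0)
rejoin = member.needs_rejoin(4.0)
print(expired, rejoin)
```

Because state simply times out, a failed node needs no explicit teardown: its children stop hearing heartbeats and rejoin, and its parent stops receiving refreshes and forgets it.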

Performance Under Various Conditions
With 100, 1000 or 4500 clients
With or without load constraints
Od_smart performs consistently better than od_naïve and close to IP_s, as before
Load constraints can avoid hot spots

Performance Under Various Conditions II
With 2500 random d-tree servers
Merit of scalability: the maximal number of server children is very small compared with the total number of servers
–Due to the randomized and “search with locality” properties of Tapestry
