Squirrel: A peer-to-peer web cache. Sitaram Iyer (Rice University). Joint work with Ant Rowstron (MSR Cambridge) and Peter Druschel (Rice University). PODC 2002.

Presentation transcript:

Squirrel: A peer-to-peer web cache. Sitaram Iyer (Rice University). Joint work with Ant Rowstron (MSR Cambridge) and Peter Druschel (Rice University). PODC 2002 / Sitaram Iyer / Tuesday July 23 / Monterey, CA

Web Caching. Reduces: 1. Latency, 2. External traffic, 3. Load on web servers and routers. Deployed at: corporate network boundaries, ISPs, web servers, etc.

Centralized Web Cache [diagram labels: Client, Browser Cache, Web Cache, Corporate LAN, Internet, Web Server]

Cooperative Web Cache [diagram labels: Client, Browser Cache, Web Cache, Corporate LAN, Internet, Web Server]

Decentralized Web Cache [diagram labels: Client, Browser, Browser Cache, Squirrel, Corporate LAN, Internet, Web Server]

Distributed Hash Table. Peer-to-peer location service: Pastry. Completely decentralized and self-organizing. Fault-tolerant, scalable, efficient. Operations: Insert(k,v), Lookup(k). [diagram: key-value pairs (k1,v1) through (k6,v6) spread across nodes] Peer-to-peer routing and location substrate.
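The Insert(k,v) and Lookup(k) operations can be illustrated with a single-process toy. This is only a sketch, not Pastry: there is no routing, no node join or failure handling, and the class name ToyDHT, the helper ident, the node names, and the use of truncated SHA-1 for ids are all assumptions made for illustration. It keeps one property of the real substrate: a key is stored on the node whose id is numerically closest to the key's id.

```python
import hashlib
from bisect import bisect_left

ID_SPACE = 1 << 128  # assumed 128-bit circular id space


def ident(text: str) -> int:
    """Map a string to a 128-bit id (truncated SHA-1; an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest()[:16], "big")


class ToyDHT:
    """Hypothetical in-memory stand-in for a DHT; every 'node' is just a dict."""

    def __init__(self, node_names):
        self.ring = sorted((ident(name), name) for name in node_names)
        self.ids = [node_id for node_id, _ in self.ring]
        self.store = {name: {} for name in node_names}

    def home_node(self, key: str) -> str:
        """Return the node whose id is numerically closest to the key's id."""
        k = ident(key)
        pos = bisect_left(self.ids, k)
        n = len(self.ring)
        # The closest id on a circle is either the successor or the predecessor.
        neighbours = (self.ring[pos % n], self.ring[(pos - 1) % n])

        def distance(entry):
            d = abs(entry[0] - k)
            return min(d, ID_SPACE - d)

        return min(neighbours, key=distance)[1]

    def insert(self, key, value):
        self.store[self.home_node(key)][key] = value

    def lookup(self, key):
        return self.store[self.home_node(key)].get(key)


dht = ToyDHT([f"node-{i}" for i in range(8)])
dht.insert("k1", "v1")
print(dht.home_node("k1"), dht.lookup("k1"))
```

Because the home node depends only on the hash of the key and the current set of node ids, any participant can locate a key without a central index.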

Why peer-to-peer? 1. Cost of dedicated web cache: no additional hardware. 2. Administrative effort: self-organizing network. 3. Scaling implies upgrading: resources grow with clients.

Setting Corporate LAN ,000 desktop machines. Located in a single building or campus. Each node runs an instance of Squirrel and sets it as the browser's proxy.

Mapping Squirrel onto Pastry. Two approaches: Home-store, Directory.

Home-store model [diagram labels: client, URL hash, home, LAN, Internet]

Home-store model [diagram labels: client, home] …that's how it works!
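The home-store request path can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the names HomeStoreNode, home_of, ident, and fetch_origin are hypothetical, the home node is picked by a brute-force closest-id search instead of Pastry routing, and freshness checks (the "notmod" replies in the full protocol), cache eviction, and failures are all left out.

```python
import hashlib

ID_SPACE = 1 << 128  # assumed 128-bit circular id space


def ident(text):
    """Map a string to a 128-bit id (truncated SHA-1; an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest()[:16], "big")


def home_of(url, node_names):
    """Stand-in for Pastry routing: the node whose id is closest to hash(url)."""
    k = ident(url)

    def distance(name):
        d = abs(ident(name) - k)
        return min(d, ID_SPACE - d)

    return min(node_names, key=distance)


class HomeStoreNode:
    """Hypothetical Squirrel node in the home-store model (no eviction)."""

    def __init__(self, name, network, fetch_origin):
        self.name = name
        self.network = network            # shared dict: node name -> node
        self.fetch_origin = fetch_origin  # callable standing in for the WAN
        self.cache = {}
        network[name] = self

    def browse(self, url):
        """Client role: local cache first, then the URL's home node."""
        if url in self.cache:
            return self.cache[url], "local"
        home = self.network[home_of(url, list(self.network))]
        obj, source = home.serve(url)
        self.cache[url] = obj             # clients always cache locally
        return obj, source

    def serve(self, url):
        """Home role: answer from the home's cache, else fetch and store."""
        if url in self.cache:
            return self.cache[url], "home"
        obj = self.fetch_origin(url)
        self.cache[url] = obj             # the home node also stores the object
        return obj, "origin"


origin_fetches = []


def fetch_origin(url):
    origin_fetches.append(url)
    return f"<contents of {url}>"


network = {}
nodes = [HomeStoreNode(n, network, fetch_origin) for n in ("desk-1", "desk-2", "desk-3")]
for node in nodes:
    print(node.name, node.browse("http://example.org/")[1])
print("origin fetches:", len(origin_fetches))
```

In this sketch only the first request for a URL crosses to the origin; later requests from other desktops are answered inside the LAN.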

Directory model Client nodes always cache objects locally. Home-store: home node also stores objects. Directory: the home node only stores pointers to recent clients, and forwards requests.

Directory model [diagram labels: client, home, LAN, Internet]

Directory model [diagram labels: client, home]. Randomly choose entry from table.
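A comparable sketch of the directory model is given below, with the same caveats. The names DirectoryNode, pick_delegate, home_of, and ident are hypothetical, the bound of four pointers per URL is an assumed parameter, and the real protocol has the home forward the request to the chosen delegate, whereas here the client simply reads the delegate's cache. Freshness checks, eviction, and failures are omitted.

```python
import hashlib
import random
from collections import deque

ID_SPACE = 1 << 128  # assumed 128-bit circular id space


def ident(text):
    """Map a string to a 128-bit id (truncated SHA-1; an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest()[:16], "big")


def home_of(url, node_names):
    """Stand-in for Pastry routing: the node whose id is closest to hash(url)."""
    k = ident(url)

    def distance(name):
        d = abs(ident(name) - k)
        return min(d, ID_SPACE - d)

    return min(node_names, key=distance)


class DirectoryNode:
    """Hypothetical Squirrel node in the directory model (no eviction)."""

    def __init__(self, name, network, fetch_origin, max_pointers=4, rng=None):
        self.name = name
        self.network = network            # shared dict: node name -> node
        self.fetch_origin = fetch_origin  # callable standing in for the WAN
        self.max_pointers = max_pointers  # assumed bound on pointers per URL
        self.rng = rng or random.Random()
        self.cache = {}                   # objects this node fetched as a client
        self.directory = {}               # url -> recent clients holding the object
        network[name] = self

    def browse(self, url):
        """Client role: local cache, then ask the home node for a delegate."""
        if url in self.cache:
            return self.cache[url], "local"
        home = self.network[home_of(url, list(self.network))]
        delegate = home.pick_delegate(url, requester=self.name)
        if delegate is None:
            obj, source = self.fetch_origin(url), "origin"  # no directory yet
        else:
            obj, source = self.network[delegate].cache[url], "delegate"
        self.cache[url] = obj
        return obj, source

    def pick_delegate(self, url, requester):
        """Home role: choose a random pointer, then remember the requester."""
        pointers = self.directory.setdefault(url, deque(maxlen=self.max_pointers))
        choice = self.rng.choice(list(pointers)) if pointers else None
        if requester not in pointers:
            pointers.append(requester)
        return choice


def fetch_origin(url):
    return f"<contents of {url}>"


network = {}
nodes = [DirectoryNode(n, network, fetch_origin) for n in ("desk-1", "desk-2", "desk-3")]
for node in nodes:
    print(node.name, node.browse("http://example.org/")[1])
```

Unlike the home-store sketch, the home node here holds only pointers; an object lives on a node only if that node requested it as a client.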

Directory: Advantages Avoids storing unnecessary copies of objects. Rapidly changing directory for popular objects seems to improve load balancing. Home-store scheme can incur hotspots.

Directory: Disadvantages Cache insertion only happens at clients, so: active clients store all the popular objects, inactive clients waste most of their storage. Implications: 1.Reduced cache size. 2.Load imbalance.

Directory: Load spike example. Web page with many embedded images, or periods of heavy browsing. Many home nodes point to such clients! Evaluate …

Trace characteristics. Microsoft in :

                               Redmond          Cambridge
Total duration                 1 day            31 days
Number of clients              36,
Number of HTTP requests        16.41 million    0.971 million
Peak request rate              606 req/sec      186 req/sec
Number of objects              5.13 million     0.469 million
Number of cacheable objects    2.56 million     0.226 million
Mean cacheable object reuse    5.4 times        3.22 times
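A few derived figures help in reading the trace numbers. The short calculation below is a sketch that assumes the durations are exactly 1 day and 31 days and that requests are averaged over the whole period; the dictionary layout and the function name summarize are illustrative only. Under those assumptions the mean rate is about 190 req/sec for Redmond and about 0.36 req/sec for Cambridge, so the peaks of 606 and 186 req/sec are far above the averages, and roughly half of the objects in each trace are cacheable.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Values copied from the trace characteristics slide.
traces = {
    "Redmond": {"days": 1, "requests": 16.41e6, "peak_rps": 606,
                "objects": 5.13e6, "cacheable": 2.56e6},
    "Cambridge": {"days": 31, "requests": 0.971e6, "peak_rps": 186,
                  "objects": 0.469e6, "cacheable": 0.226e6},
}


def summarize(trace):
    """Derive the mean request rate, peak-to-mean ratio, and cacheable fraction."""
    mean_rps = trace["requests"] / (trace["days"] * SECONDS_PER_DAY)
    return {
        "mean_rps": mean_rps,
        "peak_to_mean": trace["peak_rps"] / mean_rps,
        "cacheable_fraction": trace["cacheable"] / trace["objects"],
    }


for site, trace in traces.items():
    s = summarize(trace)
    print(f"{site}: mean {s['mean_rps']:.2f} req/sec, "
          f"peak/mean {s['peak_to_mean']:.1f}, "
          f"cacheable {s['cacheable_fraction']:.1%}")
```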

Total external traffic, Redmond [figure: total external traffic (GB, lower is better) vs. per-node cache size (MB); curves: Directory, Home-store, No web cache, Centralized cache]

Total external traffic, Cambridge [figure: total external traffic (GB, lower is better) vs. per-node cache size (MB); curves: Directory, Home-store, No web cache, Centralized cache]

LAN Hops, Redmond [figure: % of cacheable requests (0% to 100%) vs. total hops within the LAN; series: Centralized, Home-store, Directory]

LAN Hops, Cambridge [figure: % of cacheable requests (0% to 100%) vs. total hops within the LAN; series: Centralized, Home-store, Directory]

Load in requests per sec, Redmond [figure: number of times observed vs. max objects served per-node / second; series: Home-store, Directory]

Load in requests per sec, Cambridge [figure: number of times observed vs. max objects served per-node / second; series: Home-store, Directory]

Load in requests per min, Redmond [figure: number of times observed vs. max objects served per-node / minute; series: Home-store, Directory]

Load in requests per min, Cambridge [figure: number of times observed vs. max objects served per-node / minute; series: Home-store, Directory]

Fault tolerance. Sudden node failures result in partial loss of cached content. Home-store: proportional to failed nodes. Directory: more vulnerable.

Fault tolerance. If 1% of Squirrel nodes abruptly crash, the fraction of lost cached content is:

             Home-store             Directory
Redmond      Mean 1%, Max 1.77%     Mean 1.71%, Max 19.3%
Cambridge    Mean 1%, Max 3.52%     Mean 1.65%, Max 9.8%

Conclusions. It is possible to decentralize web caching. Performance is comparable to a centralized web cache; the decentralized cache is better in terms of cost, scalability, and administration effort; and, under our assumptions, the home-store scheme is superior to the directory scheme.

Other aspects of Squirrel. Adaptive replication: hotspot avoidance, improved robustness. Route caching: fewer LAN hops.

Thanks.

(backup) Storage utilization, Redmond (Home-store / Directory). Total: MB / 61652 MB. Mean per-node: 2.6 MB / 1.6 MB. Max per-node: 1664 MB.

(backup) Fault tolerance

             Home-store                  Directory
Equations    Mean H/O, Max H_max/O       Mean (H+S)/O, Max max(H_max, S_max)/O
Redmond      Mean %, Max %               Mean 0.198%, Max 1.5%
Cambridge    Mean 0.95%, Max 3.34%       Mean 1.68%, Max 12.4%

(backup) Full home-store protocol [diagram: client, home, other nodes (LAN), and origin server (WAN). Labels: req from client to home; a: object or notmod from home; b: req to origin, then object or notmod from origin]

(backup) Full directory protocol [diagram: client, home (dir), other nodes, delegate, and origin server. Labels: a, d: req; a: no dir, go to origin (also d); b: not-modified; c, e: req; c, e: object; e: cGET req; not-modified; object or delegate]

(backup) Peer-to-peer Computing. Decentralize a distributed protocol: scalable, self-organizing, fault tolerant, load balanced. Not automatic!!

Decentralized Web Cache [diagram labels: Browser Cache, Browser Cache, LAN, Internet, Web Server]

Challenge Decentralized web caching algorithm: Need to achieve those benefits in practice! Need to keep overhead unnoticeably low. Node failures should not become significant.

Peer-to-peer routing, e.g., Pastry. Peer-to-peer object location and routing substrate = Distributed Hash Table. Reliably maps an object key to a live node. Routes in log_16(N) steps (e.g. 3-4 steps for 100,000 nodes).
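The logarithmic step count is easy to check numerically. The snippet below evaluates log base 16 of N for a few network sizes; the function name routing_steps and the parameter b (bits resolved per hop, with base 2^b = 16 when b = 4) are labels chosen for this sketch. For N = 100,000 the value is about 4.15, in line with the "3-4 steps" quoted on the slide.

```python
import math


def routing_steps(num_nodes, b=4):
    """Approximate routing steps as log base 2**b of N (base 16 when b = 4)."""
    return math.log(num_nodes) / math.log(2 ** b)


for n in (100, 1_000, 10_000, 100_000):
    print(f"N = {n:>7}: about {routing_steps(n):.2f} steps")
```

The count grows slowly: each tenfold increase in nodes adds less than one step.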

Home-store is better! Simpler home-store scheme achieves load balancing by hash function randomization. Directory scheme implicitly relies on access patterns for load distribution.

Directory scheme seems better… Avoids storing unnecessary copies of objects. Rapidly changing directory for popular objects results in load balancing.

Interesting difference. Consider: a web page with many images, or a heavily browsing node. Directory: many pointers to some node. Home-store: natural load balancing. Evaluate …

Fault tolerance. When a single Squirrel node crashes, the fraction of lost cached content is:

             Home-store               Directory
Redmond      Mean %, Max %            Mean 0.2%, Max 1.5%
Cambridge    Mean 0.95%, Max 3.34%    Mean 1.7%, Max 12.4%